Optimal Transmit-Diversity Precoders for Random Fading Channels†

Georgios B. Giannakis and Shengli Zhou
Dept. of ECE, Univ. of Minnesota, 200 Union Street SE, Minneapolis, MN 55455
Emails: fgeorgios, [email protected]

Abstract: Optimal transmitter-receiver designs obeying the water-filling principle are well known and widely applied when the propagation channel is deterministically known and frequently updated at the transmitter. Because such channel knowledge may be costly or impossible to acquire in rapidly varying wireless environments, we develop in this paper statistical water-filling-like criteria for stationary random fading channels. The resulting optimal designs require only knowledge of the channel's correlation matrix, which does not need frequent updates and can be easily acquired. Applied to a multiple transmit-antenna paradigm, the optimal precoder and loading algorithm outperform not only the conventional equal-power allocation across all antennas but also their deterministic water-filling counterparts in fast fading scenarios.
I. Introduction

Antenna diversity is well motivated for wireless communications through fading channels. In many cases, however (e.g., in the cellular downlink), receive antennas may be expensive or impractical, which motivates seeking diversity gain through multiple transmit antennas [11], [10], [4]. The diversity gain at the transmitter reaches within 0.1 dB of that at the receiver with the same number of antennas [11].

With deterministically known channel state information (CSI) at the transmitter, optimal transmitter design based on the water-filling principle [3] has been proposed in [1], [7], [8]. However, when the channel is fast fading, acquiring CSI at the transmitter is costly and inaccurate, and the optimal design based on previously acquired information becomes outdated quickly. Therefore, it is meaningful to design optimal transmitters by modeling the channel explicitly as a stationary random process.

So long as the channel remains stationary, it has invariant statistics. Through field measurements, the transmitter can acquire such statistical CSI a priori. Alternatively, the receiver can estimate channel statistics and feed them back to the transmitter online. In some applications such as Time Division Duplex (TDD) systems, the transmitter can obtain channel statistics directly since the forward and backward channels share the same physical channel during different time slots. Based on channel covariance information, optimal transmitter design is proposed in [2] to minimize the system Symbol Error Rate (SER). However, [2] is only applicable to a fixed constellation (BPSK), a fixed modulation (differential encoding), and specific random channels of known probability density function (Rayleigh p.d.f.).

In this paper, we formulate general criteria to design optimal transmitter precoders based only on the channel's second-order statistics. The resulting optimal precoders are thus applicable to any constellation, any modulation, and any channel type (e.g., Rayleigh, Rician, Nakagami [6]). This paper is organized as follows: Section II describes the discrete-time baseband equivalent system model, and Section III develops optimal precoders derived under two different criteria. Performance is analyzed in Section IV with numerical results. Conclusions are then drawn in Section V.

† This work was supported by NSF Wireless Initiative grant no. 99-79443.

II. System model

[Fig. 1. Discrete-time model: at the transmitter, the symbol $s(n)$ is spread by the codes $c_1, \ldots, c_L$ into $u_1(n), \ldots, u_L(n)$ and sent from antennas 1 to $L$; the channels $h_1, \ldots, h_L$ and the noise $v(t)$ produce $x(t)$, which is sampled to $x(n)$ and processed by the receiver $g$ to yield $\hat{s}(n)$.]

Fig. 1 depicts the block diagram of a transmit diversity system with $L$ transmit antennas. In the $i$th ($i \in [1, L]$) branch, the information-bearing signal $s(n)$ is first spread by the code $\mathbf{c}_i := [c_i(0), \ldots, c_i(P-1)]^T$ of length $P$ to obtain $u_i(n) = \sum_{k=-\infty}^{\infty} s(k)\, c_i(n - kP)$. After pulse shaping by the filter $\varphi(t)$ (not shown in Fig. 1), the continuous-time signal $u_i(t) = \sum_{n=-\infty}^{\infty} u_i(n)\, \varphi(t - nT_c)$ is transmitted through the $i$th antenna, where $T_c$ is the chip duration. We here assume that the channels are flat faded (frequency non-selective) with complex fading coefficients $h_i$, $i = 1, \ldots, L$.

The received signal in the presence of additive Gaussian noise $v(t)$ is then given by $x(t) = \sum_{i=1}^{L} h_i u_i(t) + v(t)$. After chip-level filtering with a receive filter matched to $\varphi(t)$, $x(t)$ is sampled at $t = nT_c$ to yield the discrete-time signal $x(n) := x(t)|_{t=nT_c}$. Selecting $\varphi(t)$ to possess the Nyquist-$T_c$ property avoids inter-symbol interference, and allows us to express $x(n)$ as:

$$x(n) = \sum_{k=-\infty}^{\infty} \sum_{i=1}^{L} h_i\, s(k)\, c_i(n - kP) + v(n), \tag{1}$$

where $v(n) := v(t)|_{t=nT_c}$. To cast (1) into a convenient matrix-vector form, we let $^T$ denote transpose and define
the $P \times 1$ vectors $\mathbf{x}(n) := [x(nP + 0), \ldots, x(nP + P - 1)]^T$ and $\mathbf{v}(n) := [v(nP + 0), \ldots, v(nP + P - 1)]^T$; the $L \times 1$ vector $\mathbf{h} := [h_1, \ldots, h_L]^T$; and the $P \times L$ code matrix $\mathbf{C} := [\mathbf{c}_1, \ldots, \mathbf{c}_L]$. We can then rewrite (1) as $\mathbf{x}(n) = \mathbf{C}\mathbf{h}\, s(n) + \mathbf{v}(n)$. Because we only focus on symbol-by-symbol detection, we omit the time index $n$ and subsequently obtain:

$$\mathbf{x} = \mathbf{C}\mathbf{h}\, s + \mathbf{v}. \tag{2}$$
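The step from the chip-rate sum (1) to the block form (2) can be checked numerically. The following is a minimal sketch, assuming noise-free transmission, a finite burst of QPSK symbols, and random real-valued codes; the function names `chip_rate_signal` and `block_signal` are hypothetical and are not part of the paper.

```python
import numpy as np

def chip_rate_signal(symbols, C, h):
    """Noise-free chip sequence of (1): x(n) = sum_k sum_i h_i s(k) c_i(n - kP)."""
    P, L = C.shape
    x = np.zeros(P * len(symbols), dtype=complex)
    for k, s_k in enumerate(symbols):
        for i in range(L):
            # c_i(n - kP) is nonzero only for chips n in [kP, kP + P - 1]
            x[k * P:(k + 1) * P] += h[i] * s_k * C[:, i]
    return x

def block_signal(symbols, C, h):
    """Noise-free block form of (2): x(n) = C h s(n), one row per symbol."""
    return np.array([(C @ h) * s_n for s_n in symbols])

rng = np.random.default_rng(0)
P, L = 4, 3                                  # illustrative sizes (assumed)
C = rng.standard_normal((P, L))              # columns play the role of c_1..c_L
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)
symbols = np.exp(1j * np.pi / 4 * (2 * rng.integers(0, 4, size=5) + 1))  # QPSK

x_chips = chip_rate_signal(symbols, C, h)
x_blocks = block_signal(symbols, C, h)
print(np.allclose(x_chips.reshape(len(symbols), P), x_blocks))
```

Because each code has length $P$, consecutive symbols occupy disjoint chip blocks, which is why stacking $P$ chips per symbol reproduces $\mathbf{C}\mathbf{h}\,s(n)$ exactly.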
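At the receiver, the channel estimate $\hat{\mathbf{h}}$ is obtained first to enable maximum ratio combining (MRC) using

$$\hat{\mathbf{g}}_{opt} := [g(0), \ldots, g(P-1)]^T = \mathbf{C}\hat{\mathbf{h}}. \tag{3}$$

MRC is known to maximize the signal-to-noise ratio (SNR) [6] at its output, which yields the symbol estimate $\hat{s} = \hat{\mathbf{g}}_{opt}^H \mathbf{x} = \hat{\mathbf{h}}^H \mathbf{C}^H \mathbf{x}$, where $(\cdot)^H$ denotes Hermitian transpose. Given a precoder $\mathbf{C}$, (3) specifies the optimal receiver $\mathbf{g}$ in the sense of maximizing output SNR. It also shows that the effectiveness of MRC hinges critically on the quality of the channel estimate $\hat{\mathbf{h}}$; i.e., the better $\hat{\mathbf{h}}$ is, the better $\hat{\mathbf{g}}_{opt}$ will perform. The question that arises is how to select the precoder $\mathbf{C}$. Once the goodness criterion of $\hat{\mathbf{h}}$ has been specified, we propose to optimize it with respect to $\mathbf{C}$ in order to obtain the optimum precoder.

III. Optimal precoder designs

For simplicity, in this paper we adopt the following assumptions:
a0) channel $\mathbf{h}$ and noise $\mathbf{v}$ are uncorrelated; i.e., $E[\mathbf{h}\mathbf{v}^H] = \mathbf{0}$, where $\mathbf{0}$ denotes the all-zero matrix.
a1) noise $\mathbf{v}$ is white; i.e., $\mathbf{R}_{vv} := E[\mathbf{v}\mathbf{v}^H] = \sigma_v^2 \mathbf{I}$, where $\mathbf{I}$ denotes the identity matrix.

To obtain the optimal transceivers $(\mathbf{C}, \mathbf{g})$ for random channels, one may be tempted to minimize the mean-square error of symbol estimates, $E[|s - \hat{s}|^2]$, with respect to all possible channel realizations. However, it turns out that minimizing $E[|s - \hat{s}|^2]$ only yields a trivial solution for random channels (recall that [8] models the channel as deterministic). Once the precoder $\mathbf{C}$ is specified, the optimal receiver is given by (3). Therefore, the more accurate $\hat{\mathbf{h}}$ is, the better the overall system performance. This observation motivates us to search for an optimal $\mathbf{C}$ that minimizes the channel estimation error variance.

A. Mean-Square Channel Error Criterion

Letting $\mathbf{R}_{hx} := E[\mathbf{h}\mathbf{x}^H]$ and $\mathbf{R}_{xx} := E[\mathbf{x}\mathbf{x}^H]$, the linear MMSE estimator for $\mathbf{h}$ given $\mathbf{x}$ is:

$$\hat{\mathbf{h}} = E[\mathbf{h}] + \mathbf{R}_{hx}\mathbf{R}_{xx}^{-1}(\mathbf{x} - E[\mathbf{x}]), \tag{4}$$

and has an error covariance matrix:

$$E[(\hat{\mathbf{h}} - \mathbf{h})(\hat{\mathbf{h}} - \mathbf{h})^H] = \mathbf{R}_{hh} - \mathbf{R}_{hx}\mathbf{R}_{xx}^{-1}\mathbf{R}_{hx}^H. \tag{5}$$

Without loss of generality, we set $\mathbf{h}$ and $\mathbf{x}$ to have zero mean in our subsequent derivation. Indeed, if $E[\mathbf{h}] \neq \mathbf{0}$ and $E[\mathbf{x}] \neq \mathbf{0}$, it suffices to substitute $\mathbf{h}$, $\mathbf{x}$ by their zero-mean counterparts $\mathbf{h}' := \mathbf{h} - E[\mathbf{h}]$ and $\mathbf{x}' := \mathbf{x} - E[\mathbf{x}]$, respectively. Simplifying (4) to $\hat{\mathbf{h}} = \mathbf{R}_{hx}\mathbf{R}_{xx}^{-1}\mathbf{x}$, we find using a0) that $\mathbf{R}_{hx} = \mathbf{R}_{hh}\mathbf{C}^H s^*$ and $\mathbf{R}_{xx} = |s|^2 \mathbf{C}\mathbf{R}_{hh}\mathbf{C}^H + \mathbf{R}_{vv}$, which allow us to write (4) as:

$$\hat{\mathbf{h}} = \mathbf{R}_{hh}\mathbf{C}^H s^* \left(|s|^2 \mathbf{C}\mathbf{R}_{hh}\mathbf{C}^H + \mathbf{R}_{vv}\right)^{-1}\mathbf{x}. \tag{6}$$

We will select the precoder $\mathbf{C}$ to minimize the mean-square error of the channel estimator [c.f. (4), (5)]:

$$\mathcal{E}(\mathbf{C}) = \mathrm{tr}\left\{\mathbf{R}_{hh} - \mathbf{R}_{hx}\mathbf{R}_{xx}^{-1}\mathbf{R}_{hx}^H\right\} = \mathrm{tr}\left\{\mathbf{R}_{hh} - |s|^2 \mathbf{R}_{hh}\mathbf{C}^H\left(|s|^2\mathbf{C}\mathbf{R}_{hh}\mathbf{C}^H + \mathbf{R}_{vv}\right)^{-1}\mathbf{C}\mathbf{R}_{hh}\right\} = \mathrm{tr}\left\{\left(\mathbf{R}_{hh}^{-1} + |s|^2\mathbf{C}^H\mathbf{R}_{vv}^{-1}\mathbf{C}\right)^{-1}\right\}, \tag{7}$$

where the last step results from the matrix inversion lemma [5, p. 565] and $\mathrm{tr}\{\cdot\}$ denotes the trace operator. Without any constraint, minimizing $\mathcal{E}$ leads to the trivial solution that requires infinite power to be transmitted ($\|\mathbf{C}\| = \infty$). A reasonable constraint that takes into account limited budget resources is the transmitted power, which is expressed as $P_0 = \mathrm{tr}\{(s\mathbf{C})^H(s\mathbf{C})\} = \mathrm{tr}\{|s|^2\mathbf{C}^H\mathbf{C}\}$. With the transmit-power constraint, our objective becomes:

$$\min_{\mathbf{C}} \mathcal{E}(\mathbf{C}) \quad \text{subject to} \quad \mathcal{C} := \mathrm{tr}\{|s|^2\mathbf{C}^H\mathbf{C}\} - P_0 = 0. \tag{8}$$

To simplify $\mathcal{E}$ in (7), we diagonalize $\mathbf{R}_{hh}$ using its spectral decomposition:

$$\mathbf{R}_{hh} = \mathbf{U}\mathbf{D}_h\mathbf{U}^H, \quad \mathbf{D}_h = \mathrm{diag}(\lambda_{11}, \ldots, \lambda_{LL}), \tag{9}$$

where $\mathbf{U}$ is unitary, $\lambda_{ii} \geq 0$ denotes the $i$th eigenvalue of $\mathbf{R}_{hh}$, and $\mathrm{diag}(\cdot)$ stands for a diagonal matrix with the specified diagonal entries. If the $h_i$'s are uncorrelated, then $\lambda_{ii} = \sigma_{h_i}^2 := E[|h_i|^2]$. Based on (9) and a1), we can rewrite (7) as:

$$\mathcal{E}(\mathbf{C}) = \mathrm{tr}\left\{\left(\mathbf{U}\mathbf{D}_h^{-1}\mathbf{U}^H + \frac{|s|^2}{\sigma_v^2}\mathbf{C}^H\mathbf{C}\right)^{-1}\right\} = \mathrm{tr}\left\{\left(\mathbf{D}_h^{-1} + \frac{|s|^2}{\sigma_v^2}\mathbf{U}^H\mathbf{C}^H\mathbf{C}\mathbf{U}\right)^{-1}\right\} := \mathrm{tr}\left\{\left(\mathbf{D}_h^{-1} + \frac{|s|^2}{\sigma_v^2}\mathbf{F}^H\mathbf{F}\right)^{-1}\right\}, \tag{10}$$

where $\mathbf{F} := \mathbf{C}\mathbf{U}$, and $\mathbf{C} = \mathbf{F}\mathbf{U}^H$ is uniquely determined by $\mathbf{F}$. Because $\mathbf{U}$ is unitary, the power constraint is equivalently written as

$$\mathrm{tr}\{|s|^2\mathbf{F}^H\mathbf{F}\} = P_0. \tag{11}$$

Equations (10) and (11) imply that only the eigenvalues of $\mathbf{F}^H\mathbf{F}$, denoted by $f_{11}^2, \ldots, f_{LL}^2$, need to be designed in our constrained minimization problem. Specifically, (8) can be rewritten as:

$$\min_{\mathbf{F}} \mathcal{E}(\mathbf{F}) = \sum_{i=1}^{L}\left(\frac{1}{\lambda_{ii}} + \frac{|s|^2}{\sigma_v^2} f_{ii}^2\right)^{-1} \quad \text{subject to} \quad \mathcal{C} := |s|^2\sum_{i=1}^{L} f_{ii}^2 - P_0 = 0. \tag{12}$$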
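The chain of equalities in (7) and the diagonal form (12) can be sanity-checked numerically. The sketch below assumes a randomly generated Hermitian positive-definite $\mathbf{R}_{hh}$, white noise as in a1), and a precoder built so that $\mathbf{F}^H\mathbf{F}$ is diagonal; the helper names (`mse_direct`, `mse_lemma`, `mse_diagonal`), the matrix `Phi` with orthonormal columns, and all numerical values are hypothetical choices made only for this check.

```python
import numpy as np

def mse_direct(R_hh, C, s, sigma_v2):
    """tr{R_hh - R_hx R_xx^{-1} R_hx^H}: first expression in (7)."""
    P = C.shape[0]
    R_hx = (R_hh @ C.conj().T) * np.conj(s)
    R_xx = abs(s) ** 2 * (C @ R_hh @ C.conj().T) + sigma_v2 * np.eye(P)
    return np.trace(R_hh - R_hx @ np.linalg.solve(R_xx, R_hx.conj().T)).real

def mse_lemma(R_hh, C, s, sigma_v2):
    """tr{(R_hh^{-1} + |s|^2 C^H C / sigma_v^2)^{-1}}: last expression in (7) under a1)."""
    A = np.linalg.inv(R_hh) + (abs(s) ** 2 / sigma_v2) * (C.conj().T @ C)
    return np.trace(np.linalg.inv(A)).real

def mse_diagonal(lam, f2, s, sigma_v2):
    """Sum in (12); valid when F^H F = diag(f2)."""
    return float(np.sum(1.0 / (1.0 / lam + abs(s) ** 2 / sigma_v2 * f2)))

rng = np.random.default_rng(1)
P, L, s, sigma_v2 = 4, 3, np.exp(1j * np.pi / 4), 0.5
A = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
R_hh = A @ A.conj().T + 0.1 * np.eye(L)              # Hermitian positive definite
lam, U = np.linalg.eigh(R_hh)                         # R_hh = U diag(lam) U^H, as in (9)
Phi, _ = np.linalg.qr(rng.standard_normal((P, L)))    # P x L, orthonormal columns
f2 = np.array([0.7, 1.1, 0.2])                        # arbitrary loading for the check
C = Phi @ np.diag(np.sqrt(f2)) @ U.conj().T           # then F = C U = Phi diag(sqrt(f2))

e_direct = mse_direct(R_hh, C, s, sigma_v2)
e_lemma = mse_lemma(R_hh, C, s, sigma_v2)
e_diag = mse_diagonal(lam, f2, s, sigma_v2)
print(e_direct, e_lemma, e_diag)
```

The three values agree up to rounding, which illustrates that, for this family of precoders, the estimation error depends on $\mathbf{C}$ only through the loading values $f_{ii}^2$.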
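Applying the Lagrange multiplier method, we form the Lagrangian

$$\mathcal{E} + \mu\,\mathcal{C} = \sum_{i=1}^{L}\left(\frac{1}{\lambda_{ii}} + \frac{|s|^2}{\sigma_v^2} f_{ii}^2\right)^{-1} + \mu\left(|s|^2\sum_{i=1}^{L} f_{ii}^2 - P_0\right), \tag{13}$$

and take derivatives with respect to $f_{ii}^2$ to obtain:

$$\frac{\partial(\mathcal{E} + \mu\,\mathcal{C})}{\partial f_{ii}^2} = -\frac{|s|^2}{\sigma_v^2}\left(\frac{1}{\lambda_{ii}} + \frac{|s|^2}{\sigma_v^2} f_{ii}^2\right)^{-2} + \mu |s|^2 = 0. \tag{14}$$

Solving for $f_{ii}^2$ from (14), we find:

$$f_{ii}^2 = \frac{\sigma_v}{|s|^2\sqrt{\mu}} - \frac{\sigma_v^2}{|s|^2\lambda_{ii}}, \tag{15}$$

and plugging $f_{ii}^2$ back into the power constraint in (12), we obtain $\mu$ as:

$$\sqrt{\frac{\sigma_v^2}{\mu}} = \frac{1}{L}\left(P_0 + \sum_{i=1}^{L}\frac{\sigma_v^2}{\lambda_{ii}}\right). \tag{16}$$

Substituting (16) into (15), we reach the optimal $\mathbf{F}$ with its eigenvalues given by:

$$f_{ii}^2 = \frac{1}{|s|^2}\left(\frac{P_0}{L} + \frac{1}{L}\sum_{j=1}^{L}\frac{\sigma_v^2}{\lambda_{jj}} - \frac{\sigma_v^2}{\lambda_{ii}}\right). \tag{17}$$

The only constraint on matrix $\mathbf{F}$ (thus $\mathbf{C}$) so far is that $\mathbf{F}^H\mathbf{F}$ should have eigenvalues as in (17). Without affecting the constrained optimization in (8) or (12), we can assume that $\mathbf{F}^H\mathbf{F}$ is a diagonal matrix:

$$\mathbf{F}^H\mathbf{F} = \mathbf{D}_f^2, \quad \text{where } \mathbf{D}_f := \mathrm{diag}(f_{11}, \ldots, f_{LL}), \quad f_{ii} \geq 0,\ \forall i \in [1, L]. \tag{18}$$

However, if $\mathbf{h}$ is also complex Gaussian distributed with zero mean (Rayleigh) or non-zero mean (Rician), maximization of the mutual information between the received vector $\mathbf{x}$ and the channel $\mathbf{h}$ will constrain $\mathbf{F}^H\mathbf{F}$ to have an exact diagonal form, as we describe next.

B. Conditional Mutual Information Criterion

In this section, we will further assume that:
a2) channel $\mathbf{h}$ is complex Gaussian distributed.

Recalling (2) and interchanging the roles of $\mathbf{h}$ and $s$ lets us view $\mathbf{h}$ as the input to the channel $s\mathbf{C}$. We then seek precoders $\mathbf{C}$ that maximize the mutual information $I(\mathbf{x}; \mathbf{h}|s)$ between the Gaussian input $\mathbf{h}$ and the output $\mathbf{x}$ conditioned on $s$. Mutual information between Gaussian vectors is well known (see e.g., [1, Thm. 1], [7], and references therein).

Lemma 1: Consider the finite-dimensional vector model $\mathbf{x} = s\mathbf{C}\mathbf{h} + \mathbf{v}$, where $\mathbf{h}$ and $\mathbf{v}$ satisfy a0) and a1). The conditional mutual information between $\mathbf{x}$ and $\mathbf{h}$, $I(\mathbf{x}; \mathbf{h}|s)$, is maximized when $\mathbf{h}$ is Gaussian as per a2), and is given by ($|\cdot|$ denotes matrix determinant):

$$I(\mathbf{x}; \mathbf{h}|s) = \log_2\left|\left(\mathbf{R}_{hh}^{-1} + (s\mathbf{C})^H\mathbf{R}_{vv}^{-1}(s\mathbf{C})\right)\mathbf{R}_{hh}\right|. \tag{19}$$

Based on (9) and a1), we can simplify (19) to:

$$I(\mathbf{x}; \mathbf{h}|s) = \log_2\left|\left(\mathbf{U}\mathbf{D}_h^{-1}\mathbf{U}^H + \mathbf{C}^H\mathbf{C}\,\frac{|s|^2}{\sigma_v^2}\right)\mathbf{U}\mathbf{D}_h\mathbf{U}^H\right| = \log_2\left|\left(\mathbf{D}_h^{-1} + \mathbf{U}^H\mathbf{C}^H\mathbf{C}\mathbf{U}\,\frac{|s|^2}{\sigma_v^2}\right)\mathbf{D}_h\right| = \log_2\left|\left(\mathbf{D}_h^{-1} + \mathbf{F}^H\mathbf{F}\,\frac{|s|^2}{\sigma_v^2}\right)\mathbf{D}_h\right|. \tag{20}$$

According to Hadamard's inequality [3, p. 502], the maximum $I(\mathbf{x}; \mathbf{h}|s)$ is achieved when the matrix $(\mathbf{D}_h^{-1} + \mathbf{F}^H\mathbf{F}\,|s|^2/\sigma_v^2)\mathbf{D}_h$ is diagonal. Therefore, $\mathbf{F}^H\mathbf{F}$ should be exactly diagonal and thus it can be written as in (18). We then maximize (20) under our transmit-power constraint as:

$$\max_{\mathbf{D}_f} I(\mathbf{x}; \mathbf{h}|s) = \log_2\left|\mathbf{I} + \frac{|s|^2}{\sigma_v^2}\mathbf{D}_f^2\mathbf{D}_h\right| = \sum_{i=1}^{L}\log_2\left(1 + \frac{|s|^2}{\sigma_v^2} f_{ii}^2\lambda_{ii}\right) \quad \text{subject to} \quad \mathcal{C} := |s|^2\sum_{i=1}^{L} f_{ii}^2 - P_0 = 0. \tag{21}$$

Differentiating the Lagrangian $I(\mathbf{x}; \mathbf{h}|s) + \mu\,\mathcal{C}$ with respect to $f_{ii}^2$, and equating it to zero, we obtain:

$$\frac{\partial\left(I(\mathbf{x}; \mathbf{h}|s) + \mu\,\mathcal{C}\right)}{\partial f_{ii}^2} = \frac{\lambda_{ii}|s|^2/\sigma_v^2}{\ln 2\left(1 + f_{ii}^2\lambda_{ii}|s|^2/\sigma_v^2\right)} + \mu|s|^2 = 0. \tag{22}$$

From (22), we can solve for

$$f_{ii}^2 = -\frac{1}{\mu|s|^2\ln 2} - \frac{\sigma_v^2}{|s|^2\lambda_{ii}}, \tag{23}$$

and plug into (11) to obtain $\mu$ as:

$$-\frac{1}{\mu\ln 2} = \frac{1}{L}\left(P_0 + \sum_{i=1}^{L}\frac{\sigma_v^2}{\lambda_{ii}}\right). \tag{24}$$

Substituting $\mu$ into (23), we arrive at the optimal loading:

$$f_{ii}^2 = \frac{1}{|s|^2}\left(\frac{P_0}{L} + \frac{1}{L}\sum_{j=1}^{L}\frac{\sigma_v^2}{\lambda_{jj}} - \frac{\sigma_v^2}{\lambda_{ii}}\right), \tag{25}$$

which is, perhaps surprisingly, identical to (17), although it has been derived under a different criterion. Therefore, objective (12) also leads to the maximum $I(\mathbf{x}; \mathbf{h}|s)$ in (21), under assumption a2).

C. Loading Algorithm

From Sections III-A and III-B, we see that the optimal $\mathbf{F}^H\mathbf{F}$ should be the diagonal matrix $\mathbf{D}_f^2$, with elements $f_{ii}^2$ specified in (17) (or (25)). Diagonal $\mathbf{F}^H\mathbf{F}$ implies that the columns of $\mathbf{F}$ are orthogonal and allows one to factor $\mathbf{F}$ and $\mathbf{C}$ as:

$$\mathbf{F} = \mathbf{\Phi}\mathbf{D}_f, \quad \mathbf{C} = \mathbf{\Phi}\mathbf{D}_f\mathbf{U}^H, \tag{26}$$

where the columns of $\mathbf{\Phi}$ are orthonormal (note that $\mathbf{D}_f$ takes care of power loading).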
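The closed-form loading in (17) (equivalently (25)) can be evaluated directly. The sketch below also handles the case where (17) returns a negative entry for a weak eigenvalue; the fallback assumed here is to switch off the weakest branches and re-apply (17) to the remaining ones. The function names, the default $|s|^2 = 1$, and the eigenvalues $(1, 0.05, 0.01)$ with $\sigma_v^2 = 1$ are illustrative assumptions.

```python
import numpy as np

def closed_form_loading(lam, P0, sigma_v2=1.0, s_abs2=1.0):
    """Evaluate (17)/(25) on the given eigenvalues; entries may come out negative."""
    lam = np.asarray(lam, dtype=float)
    level = (P0 + np.sum(sigma_v2 / lam)) / lam.size   # common "water level"
    return (level - sigma_v2 / lam) / s_abs2

def optimal_loading(lam, P0, sigma_v2=1.0, s_abs2=1.0):
    """Nonnegative loading: use the k strongest branches, largest feasible k."""
    lam = np.asarray(lam, dtype=float)
    order = np.argsort(lam)[::-1]                      # strongest eigenvalue first
    f2 = np.zeros_like(lam)
    for k in range(lam.size, 0, -1):
        active = order[:k]
        trial = closed_form_loading(lam[active], P0, sigma_v2, s_abs2)
        if np.all(trial >= 0):                         # k = 1 always passes for P0 >= 0
            f2[active] = trial
            break
    return f2

def channel_mse(lam, f2, sigma_v2=1.0, s_abs2=1.0):
    """Objective (12) for a diagonal loading f2."""
    lam = np.asarray(lam, dtype=float)
    return float(np.sum(1.0 / (1.0 / lam + s_abs2 / sigma_v2 * np.asarray(f2))))

lam = np.array([1.0, 0.05, 0.01])                      # illustrative eigenvalues
P0 = 100.0
f2_opt = optimal_loading(lam, P0)                      # third branch is switched off
f2_equal = np.full(lam.size, P0 / lam.size)
print(f2_opt, channel_mse(lam, f2_opt), channel_mse(lam, f2_equal))
```

For these values the sketch loads only the two strongest branches, meets the power budget exactly, and yields a smaller value of (12) than equal-power loading.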
Equation (26) provides a general optimal precoder for random channels for a given transmit-power budget. We summarize this result as follows:

Theorem 1: Suppose a0) and a1) hold true and that $\mathbf{R}_{hh}$ and $P_0$ are available. The optimum receive filter $\mathbf{g}_{opt}$ is given by (3), and the optimum precoding matrix $\mathbf{C}_{opt} = \mathbf{\Phi}\mathbf{D}_f\mathbf{U}^H$ has $\mathbf{U}$ and $\mathbf{D}_f$ formed as in (9), (17) and (18), with $\mathbf{\Phi}$ an arbitrary $P \times L$ matrix with orthonormal columns. Optimality in $\mathbf{g}_{opt}$ refers to maximum SNR, while optimality in $\mathbf{C}_{opt}$ pertains to either minimizing the random channel's estimation error, or maximizing the conditional mutual information under a2).

Note that [2] arrived at the same power loading as in (17) by minimizing the system SER. However, [2] is only applicable to differential BPSK under Rayleigh fading channels, in which case a simple closed-form expression of SER can be obtained. In contrast, Theorem 1 holds for any constellation and adopted modulation. Under the channel MMSE criterion, Theorem 1 holds regardless of the channel p.d.f. Therefore, [2] falls into the general class of Theorem 1. On the other hand, we can infer from [2] that the optimal precoder $\mathbf{C}$ leads to minimum SER for differential BPSK as well.

The requirement $f_{ii}^2 \geq 0$ in (17) imposes the following lower bound on $P_0$:

$$P_0 > \frac{L\sigma_v^2}{\min_i(\lambda_{ii})} - \sum_{i=1}^{L}\frac{\sigma_v^2}{\lambda_{ii}} := P_{th}. \tag{27}$$

If $P_0$ is not large enough to afford optimal power allocation, i.e., $P_0 < P_{th}$, we will set the minimum $f_{ii}$ to zero, and load power on the remaining $L - 1$ summands in (17). The power loading algorithm is summarized in the following steps:
1) Arrange the $\lambda_{ii}$ in decreasing order $\lambda_{11} \geq \lambda_{22} \geq \cdots \geq \lambda_{LL}$. For $\bar{L} = 1, \ldots, L$, calculate $P_{\bar{L}} := P_{th}$ from (27) based only on the first $\bar{L}$ channel eigenvalues $\lambda_{11}, \ldots, \lambda_{\bar{L}\bar{L}}$.
2) With the power budget $P_0$ in the interval $[P_i, P_{i+1}]$, set $f_{i+1,i+1}, \ldots, f_{LL} = 0$, and obtain $f_{11}, \ldots, f_{ii}$ according to (17) based only on $\lambda_{11}, \ldots, \lambda_{ii}$.

We now examine several special cases of the general precoder form in (26).

Special Case 1: If the channel coefficients are uncorrelated, then $\mathbf{U} = \mathbf{I}$ and $\mathbf{C} = \mathbf{\Phi}\mathbf{D}_f$; thus, any matrix $\mathbf{\Phi}$ with orthonormal columns, together with the optimal power loading matrix $\mathbf{D}_f$, is equivalent in terms of optimizing (8). This intuitive fact is widely applied in practice through orthogonal spreading sequences. To fully exploit the diversity gains offered by $L$ antennas, $P \geq L$ is required. However, we observe that the choice $P - L > 0$ gains nothing more in terms of optimizing (8) (or (20)) than the minimum choice $P = L$, which minimizes bandwidth requirements or, equivalently, increases information rate. The simplest $\mathbf{\Phi}$ will be the identity matrix, which in practice corresponds to allowing only one antenna transmission per time slot. If $f_{ii} = 0$, the $i$th physical antenna will be turned off to avoid power loss.

Special Case 2: If we equi-distribute power among all branches, then $\mathbf{D}_f = f_{11}\mathbf{I}$ and the precoder $\mathbf{C} = f_{11}\mathbf{\Phi}\mathbf{U}^H$ will have orthogonal columns with identical norms. The practical design with equal power allocated to each antenna and orthogonal spreading codes falls into this category, and results in a diagonal $\mathbf{F}^H\mathbf{F}$ with identical diagonal elements.

Special Case 3: If the transmit power is high enough to have $P_0 \gg \sigma_v^2/\lambda_{ii}$, $\forall i$, then $f_{ii}^2 \approx f_{11}^2$, $\forall i$. In this case, power is equally distributed to each branch and thus to each antenna. Therefore, equal power distribution is only optimal when the power is sufficiently high.

IV. Performance Analysis

To obtain a closed-form SER, we here assume that the channel estimates at the receiver are error-free. Since the received vector $\mathbf{x}$ can be expressed as $\mathbf{x} = \mathbf{\Phi}\mathbf{D}_f\mathbf{U}^H\mathbf{h}s + \mathbf{v}$, we can always multiply $\mathbf{x}$ with $\mathbf{\Phi}^H$ to obtain:

$$\tilde{\mathbf{x}} := \mathbf{\Phi}^H\mathbf{x} = \mathbf{D}_f\mathbf{U}^H\mathbf{h}s + \mathbf{\Phi}^H\mathbf{v} := \tilde{\mathbf{h}}s + \tilde{\mathbf{v}}, \tag{28}$$

where $\tilde{\mathbf{h}}$ and $\tilde{\mathbf{v}}$ henceforth denote the equivalent channel and noise vectors. The matrix $\mathbf{R}_{\tilde{h}\tilde{h}} = \mathbf{D}_f\mathbf{U}^H\mathbf{R}_{hh}\mathbf{U}\mathbf{D}_f = \mathbf{D}_f^2\mathbf{D}_h$ implies that the entries of $\tilde{\mathbf{h}}$ are uncorrelated (and independent under a2)), while $\tilde{\mathbf{v}}$ is still white since $\mathbf{R}_{\tilde{v}\tilde{v}} = \sigma_v^2\mathbf{I}$. For MRC symbol estimates $\hat{s} = \tilde{\mathbf{h}}^H\tilde{\mathbf{x}}$, it is possible to obtain a closed-form SER expression for MPSK signals [9, eq. (44)] as:

$$P_s(E) = \frac{1}{\pi}\int_0^{(M-1)\pi/M}\prod_{i=1}^{L} I_i\!\left(f_{ii}^2\lambda_{ii}/\sigma_v^2;\ g_{PSK};\ \phi\right)d\phi, \tag{29}$$

where $g_{PSK} := \sin^2(\pi/M)$, and $I_i(x; g_{PSK}; \phi)$ is the moment generating function of the p.d.f. of $\tilde{h}_i$ evaluated at $-g_{PSK}/\sin^2(\phi)$ (see [9, eq. (24)]). SER for other constellations such as QAM can be easily carried out as in [9].

Unlike [2], which relies on differential BPSK signaling to obtain a simple SER closed form, our approach provides a general framework regardless of the constellation involved or the modulation adopted. Due to space limitations, we only present simulations for QPSK ($M = 4$) and adopt the same 3-channel setup as in [2], i.e., $\lambda_{11} = 1$, $\lambda_{22} = 0.05$, $\lambda_{33} = 0.01$, with normalized noise power $\sigma_v^2 = 1$. If the $h_i$'s are independent, the physical channel will have $\sigma_{h_i}^2 = \lambda_{ii}$, $\forall i$. However, this setting corresponds also to correlated channels with the same variance and correlation coefficients $\rho_{12} = \rho_{23} = 0.94$ and $\rho_{13} = 0.86$ [2].

We define the SNR as the total transmitted power divided by the noise power: $P_0/\sigma_v^2$. Fig. 2 shows the optimal power allocation among the different $\tilde{h}_i$'s. At low power, the transmitter prefers to null certain $\tilde{h}_i$'s, while at high power it approximately equates power across all antennas to benefit from diversity. The SER curve in Fig. 3 confirms that the optimal allocation outperforms the conventional equal power allocation as well as the selective power allocation, which corresponds to simply transmitting on a few strong $\tilde{h}_i$'s.

[Fig. 2. Optimal loading vs. equal power allocation. Axes: distributed power (dB) vs. total power (dB). Curves: first branch, second branch, third branch, equate power to 3 branches.]

[Fig. 3. Symbol Error Rate vs. SNR. Curves: equal power in 3 branches, use only the best branch, optimal power allocation.]

Next, we check the robustness of the loading algorithm with respect to finite-sample effects that introduce estimation error in the channel covariance. At the receiver (or at the transmitter in a TDD mode), $\mathbf{R}_{xx}$ can be estimated by the sample average $\hat{\mathbf{R}}_{xx} = \sum_{n=1}^{N}\mathbf{x}(n)\mathbf{x}(n)^H/N$, from which we obtain

$$\hat{\mathbf{R}}_{hh} = \mathbf{C}^{\dagger}\left(\hat{\mathbf{R}}_{xx} - \sigma_v^2\mathbf{I}\right)(\mathbf{C}^H)^{\dagger}, \tag{30}$$
where $\dagger$ stands for matrix pseudo-inverse. At the beginning, we assume that the transmitter does not have CSI and employs equal power allocation to all 3 antennas. We then estimate the channel covariance matrix at the receiver using $N$ symbols. This statistical CSI is then fed back to the transmitter to initialize power allocation. Here we use the same channel setting as before, and rely on the simplest precoding matrix $\mathbf{\Phi} = \mathbf{I}_{L \times L}$. To obtain SER in closed form, we again assume that the channel is known at the receiver and apply (29). In Fig. 4, we observe that the channel covariance estimates based on only $N = 10$ symbols lead to a system performance close to the ideal case (perfectly known $\mathbf{R}_{hh}$), and outperform equal power loading considerably. As $N$ increases to 20, the power allocation algorithm based on $\hat{\mathbf{R}}_{hh}$ is indistinguishable from the ideal case. Therefore, power allocation based on channel covariance estimates is robust to estimation errors. This implies that the optimal transceivers designed with statistical CSI offer an attractive choice in fast fading applications.

[Fig. 4. Power allocation using $\hat{\mathbf{R}}_{hh}$. Axes: Symbol Error Rate vs. SNR. Curves: equal power to 3 branches, power allocation based on covariance matrix estimates using 10 symbols, power allocation based on covariance matrix estimates using 20 symbols, optimal power allocation.]

V. Conclusions

In this paper, we have proposed optimal precoder designs for systems with multiple transmit antennas and a single receive antenna, based only on channel covariance information. The optimal designs are applicable to any signal constellation and adopted modulation in random fading channels of unknown p.d.f. In rapidly fading environments they outperform existing approaches that require exact channel information. Conventional equal power allocation falls into our general precoder class and is approximately optimal when the transmitted power goes to infinity. Simulation results confirm the superiority of optimal loading with respect to conventional equal power distribution among transmit antennas.

Acknowledgments

The authors would like to thank Professor M.-S. Alouini of the ECE Dept. at the Univ. of Minnesota for drawing their attention to [2] and [9].

References

[1] N. Al-Dhahir and J. M. Cioffi, "Block transmission over dispersive channels: Transmit filter optimization and realization, and MMSE-DFE receiver performance," IEEE Trans. on Information Theory, vol. 42, no. 1, pp. 137-160, Jan. 1996.
[2] J. K. Cavers, "Optimized use of diversity modes in transmitter diversity systems," in Proc. of the VTC, 1999, pp. 1768-1773.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York: Wiley, 1991.
[4] J. C. Guey, M. P. Fitz, M. R. Bell, and W. Y. Kuo, "Signal design for transmitter diversity wireless communication systems over Rayleigh fading channels," IEEE Trans. on Comm., vol. 47, no. 4, pp. 527-537, Apr. 1999.
[5] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Inc., 3rd edition, 1996.
[6] J. Proakis, Digital Communications, McGraw-Hill, 1989.
[7] A. Scaglione, S. Barbarossa, and G. B. Giannakis, "Filterbank transceivers optimizing information rate in block transmissions over dispersive channels," IEEE Trans. on Information Theory, vol. 45, Apr. 1999.
[8] A. Scaglione, G. B. Giannakis, and S. Barbarossa, "Redundant filterbank precoders and equalizers Part I: Unification and optimal designs," IEEE Trans. on SP, pp. 1988-2006, July 1999.
[9] M. K. Simon and M.-S. Alouini, "A unified approach to the performance analysis of digital communication over generalized fading channels," Proc. of the IEEE, pp. 1860-1877, Sept. 1998.
[10] V. Tarokh, N. Seshadri, and A. R. Calderbank, "Space-time codes for high data rate wireless communication: performance criterion and code construction," IEEE Trans. on Information Theory, vol. 44, no. 2, pp. 744-765, Mar. 1998.
[11] J. H. Winters, "The diversity gain of transmit diversity in wireless systems with Rayleigh fading," IEEE Trans. on Vehicular Tech., vol. 47, no. 1, pp. 119-123, Feb. 1998.