Transmitter Optimization and Optimality of Beamforming for Multiple Antenna Systems with Imperfect Feedback
Syed Ali Jafar, Andrea Goldsmith
Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
{syed, andrea}@wsl.stanford.edu

Abstract

We solve the transmitter optimization problem and determine a necessary and sufficient condition under which beamforming achieves Shannon capacity in a narrowband point-to-point communication system employing multiple transmit and receive antennas. We assume perfect channel state information at the receiver (CSIR) and imperfect channel state feedback from the receiver to the transmitter. We consider the cases of mean and covariance feedback. The channel is modeled at the transmitter as a matrix of complex jointly Gaussian random variables with either a zero mean and a known covariance matrix (covariance feedback), or a non-zero mean and a white covariance matrix (mean feedback). For both cases we develop a necessary and sufficient condition for when the Shannon capacity is achieved through beamforming, i.e. the channel can be treated like a scalar channel and one-dimensional codes can be used to achieve capacity. We also provide a waterpouring interpretation of our results and find that less channel uncertainty not only increases the system capacity but may also allow this higher capacity to be achieved with scalar codes, which involves significantly less complexity in practice than vector coding.
I. Introduction

Recently there has been much interest in the capacity of multiple antenna systems with partial channel state information at the transmitter (CSIT). It has been found that unlike single antenna systems, where exploiting CSIT does not significantly enhance the Shannon capacity [1], for multiple antenna systems the capacity improvement through even partial CSIT can be substantial. Key work on the capacity of such systems by Telatar [2], Madhow and Visotsky [3] and Trott and Narula [4][5] has provided some interesting results and led to new questions. For any given input covariance matrix the input distribution that achieves the Shannon capacity is shown in [2] to be complex vector Gaussian, mainly because the vector Gaussian distribution maximizes the entropy for a given covariance matrix. This leads to the transmitter optimization problem, i.e., finding the optimum input covariance matrix to maximize capacity subject to a transmit power (trace of input covariance matrix) constraint. The optimum covariance matrix in general can be a full rank matrix, which implies vector coding across the antenna array. Limiting the rank of the input covariance matrix to unity (beamforming) essentially leads to a scalar coded system which has a significantly lower complexity for typical array sizes. The distinction between scalar and vector coding is further elucidated in Section III.

For a system using a single receive antenna and multiple transmit antennas, the transmitter optimization problem was recently solved by Visotsky and Madhow in [3] for the cases of mean feedback and covariance feedback. Moreover, their numerical results indicate that beamforming is close to the optimal strategy when the quality of feedback improves, i.e. the channel uncertainty decreases (mean feedback) or a stronger channel mode can be identified (covariance feedback). For mean feedback, Narula and Trott [4] point out that there are cases where the capacity is actually achieved via beamforming. While they do not obtain fully general necessary and sufficient conditions for when beamforming is a capacity achieving strategy, they develop partial answers to the problem for two transmit antennas. Note that both [3] and [4] assume a single receive antenna.

In this paper we consider system models with multiple transmit and receive antennas. For cases of both mean and covariance feedback we solve the transmitter optimization problem. The solutions obtained by Visotsky and Madhow [3] for a single receive antenna can be obtained as special cases of our results. Also, for both mean and covariance feedback, and with multiple transmit and receive antennas, we develop a necessary and sufficient condition under which beamforming is a capacity achieving strategy.

The organization of this paper is as follows. The next section contains our system model and problem statement. In particular, both mean feedback and covariance feedback cases are described. The notions of scalar coding and vector coding are elaborated in Section III. Section IV provides a brief summary of previously known results due to Visotsky and Madhow [3] and Narula and Trott [4][5]. Our solutions to the problems of transmitter optimization and optimality of beamforming under covariance feedback are provided in Sections V and VI respectively. Under mean feedback our solutions to the problems of transmitter optimization and optimality of beamforming are contained in Sections VII and VIII respectively. In Section IX we use the analytical results derived in previous sections to obtain numerical results for some specific examples. Finally, we conclude with a summary of our work in Section X.

II. System Model and Problem Statement
We use the following notation: $x \sim \tilde{N}(\mu, \sigma^2)$ implies that $x$ is a complex circularly symmetric Gaussian with mean $\mu$ and variance $\sigma^2$. $Z_i$ and $Z_j$ represent, respectively, the $i$th row and the $j$th column of a matrix $Z$. $\mathrm{Tr}[A]$ is the trace of the matrix $A$. $E[x]$ denotes the expectation of random variable $x$. Lastly, s.t. is short for subject to.

We focus on a point-to-point communication system using $n_T$ transmit and $n_R$ receive antennas over a narrowband flat fading channel. The channel matrix is represented as $H = [h_{ij}]_{n_R \times n_T}$ where $h_{ij} \sim \tilde{N}(0, \sigma_{ij}^2)$ represents the channel gain from transmit antenna $j$ to receive antenna $i$ (Rayleigh fading). It is assumed that the channel is perfectly known to the receiver while the transmitter has only partial information based on a limited channel state feedback from the receiver. Note that, as in [3], the feedback channel is assumed to be free from noise but may incorporate some delay. The $n_T$-dimensional vector symbol $x(\tau) = [x_1(\tau), x_2(\tau), \dots, x_{n_T}(\tau)]^T$ is transmitted at time instant $\tau$ to yield the $n_R$-dimensional received vector $r(\tau)$ as

$$r(\tau) = H(\tau)x(\tau) + n(\tau). \tag{1}$$

The $n_R$ components of the AWGN $n(\tau)$ are assumed to be i.i.d. $\tilde{N}(0, \sigma^2)$ and uncorrelated with the signals. To keep the notation simple the time index will be dropped in this paper.

Recall that in general for partial information $U$ at the transmitter, codes need to be defined over an extended alphabet of functions $\mathcal{U} \to \mathcal{X}$ where $\mathcal{X}$ is the input alphabet. However, when the CSIT is a deterministic function of the CSIR, optimal codes can be constructed directly over the input alphabet $\mathcal{X}$ [6]. For our case, since the receiver knows the channel perfectly and the feedback channel is noise free, the CSIT is a deterministic function of the CSIR and the capacity is easily shown to be
$$C = \max_{Q:\ \mathrm{trace}(Q) = P} C(Q), \tag{2}$$

where

$$C(Q) \triangleq E\left[\log\left|I_{n_R} + \frac{HQH^\dagger}{\sigma^2}\right|\right] \tag{3}$$
is the capacity with the input covariance matrix $E[xx^\dagger] = Q$. As shown in [3], the capacity $C(Q)$ is achieved by transmitting independent complex circular Gaussian symbols along the eigenvectors of $Q$. The powers allocated to each eigenvector are given by the eigenvalues of $Q$. Thus the capacity optimization problem involves finding the optimum $Q$ to maximize $C(Q)$ subject to the transmit power constraint $\mathrm{trace}(Q) = P$. Consistent with [3] and [5], we define beamforming as a transmission strategy where the input covariance matrix $Q$ has rank one.

Given the feedback, the entries of the channel matrix $H$ are modeled as complex jointly Gaussian random variables at the transmitter. As in [3] we consider two extreme cases: mean feedback, in which the channel state information at the transmitter is the mean of the distribution with the covariance modeled as white, and covariance feedback, in which the channel is assumed to be varying too rapidly to track its mean, so that the mean is set to zero and the information regarding the relative geometry of the propagation paths is captured by a nonwhite covariance matrix. These two cases are described in further detail as follows.

A. Covariance Feedback
Under the covariance feedback model, we assume that the entries in a given row of $H$ are correlated while those belonging to different rows of $H$ are uncorrelated. More specifically we assume that the rows of $H$ are i.i.d. while the columns are correlated. This is typical of the channel correlations obtained using the 'one-ring' model employed by Shiu et al. in [7]. The model is a ray tracing model appropriate for a scenario where the base station (BS) is unobstructed and the subscriber unit (SU) is surrounded by local scatterers. A detailed description of the model is presented in [7]. For our purpose it suffices to point out the following two features of the correlations obtained using the one-ring model:

1. $E[h_{ik} h_{il}^*] = E[h_{jk} h_{jl}^*]$ for $1 \le i, j \le n_R$; $1 \le k, l \le n_T$. So the rows of $H$ are identically distributed.
2. The scatterers surrounding the SU impose random phase shifts onto the waves incident upon them, decorrelating the fades associated with any two distinct antennas at the SU.

Thus, for a wavelength $\lambda$ and an angle spread $\Delta$ (typical values range from 0.6 degrees to 15 degrees), the minimum decorrelating antenna spacing at the receiver is just $0.38\lambda$, while at the transmitter it is $0.38\lambda\Delta^{-1}$ for a broadside transmitting antenna array and $1.53\lambda\Delta^{-2}$ for an inline transmitting antenna array. For GHz frequency operation, these correspond to antenna spacings on the order of centimeters for the receive antenna array, meters for broadside transmit antennas and hundreds of meters for inline transmit antennas. So practical systems demonstrate strong correlation between fades associated with different transmit antennas on the downlink while the fades associated with different receive antennas are fairly uncorrelated. This justifies our assumption that the rows of $H$ are uncorrelated. Since they are jointly Gaussian, they are also independent.

Mathematically, under covariance feedback, the channel $H$ is modeled at the transmitter as

$$H = H_w \Sigma^{1/2} \tag{4}$$

where $H_w$ is an $n_R \times n_T$ matrix of i.i.d., zero mean, unit variance complex Gaussian random variables and $\Sigma$ is a covariance matrix of fade correlations corresponding to different transmit antennas, revealed to the transmitter via the feedback channel. Note that while we focus only on transmitter fade correlations in this paper, Jorswieck and Boche [8] have found that the techniques developed here can also be used to solve the more general case of correlated fades at both transmit and receive antennas, modeled as

$$H = \Sigma_r^{1/2} H_w \Sigma_t^{1/2}, \tag{5}$$

where $\Sigma_r$ and $\Sigma_t$ now represent the fade correlations corresponding to the receive and transmit antennas, respectively.
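As an illustration of the model in (4), the following minimal sketch draws channel realizations whose rows have a prescribed transmit covariance. It is not taken from the paper. It assumes $n_T = 2$ and a real symmetric positive definite $\Sigma$, and it swaps the Hermitian square root $\Sigma^{1/2}$ for a Cholesky factor $L$ with $\Sigma = LL^T$, which gives rows with the same distribution. All function names are hypothetical.

```python
import math
import random


def cholesky_2x2(sigma):
    """Lower-triangular L with L L^T = sigma (real symmetric positive definite 2x2)."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def complex_gaussian(rng):
    """One zero-mean, unit-variance circularly symmetric complex Gaussian sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def sample_channel(sigma, n_r, rng):
    """Draw an n_r x 2 channel whose rows are i.i.d. with covariance sigma.

    Assumption: a Cholesky factor stands in for the square root used in (4).
    """
    chol = cholesky_2x2(sigma)
    channel = []
    for _ in range(n_r):
        w = [complex_gaussian(rng), complex_gaussian(rng)]  # one row of H_w
        row = [sum(w[m] * chol[l][m] for m in range(2)) for l in range(2)]
        channel.append(row)
    return channel


def empirical_row_covariance(sigma, n_rows, rng):
    """Average conj(h_k) * h_l over n_rows independently sampled rows."""
    acc = [[0j, 0j], [0j, 0j]]
    for row in sample_channel(sigma, n_rows, rng):
        for k in range(2):
            for l in range(2):
                acc[k][l] += row[k].conjugate() * row[l]
    return [[acc[k][l] / n_rows for l in range(2)] for k in range(2)]
```

Calling `empirical_row_covariance([[1.0, 0.8], [0.8, 1.0]], 20000, random.Random(0))` should return a matrix close to the requested covariance, up to Monte Carlo error.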
B. Mean Feedback

Our mean feedback model is based on the vector extension of the well known Ricean model used in [9]. Under this model the channel $H$ is modeled at the transmitter as

$$H = \sqrt{\frac{K}{K+1}}\, H_{sp} + \sqrt{\frac{1}{K+1}}\, H_w. \tag{6}$$

Here, $H_w$ is a matrix of i.i.d. zero mean, unit variance complex Gaussian random variables that represents scattering, $H_{sp}$ represents the specular component, and $K$ is the Ricean factor. The specular component, in turn, is given by

$$H_{sp} = a_t a_r^T \tag{7}$$

with $a_t$ and $a_r$ the specular array response vectors at the transmitter and receiver, respectively. By normalizing the transmit power accordingly we can represent the channel model at the transmitter under mean feedback as

$$H = H_\mu + H_w, \tag{8}$$

where $H_\mu$ is the mean feedback (deterministic) representing the specular component and $H_w$ is the white complex Gaussian matrix representing the scattering. For our purpose it suffices to note that $H_\mu$ is a unit rank matrix.

For the cases of mean feedback and covariance feedback as described above, we wish to solve the following two problems.

Transmitter Optimization: Characterize the optimal input covariance matrix $Q^o$ to achieve capacity.

Optimality of Beamforming: Determine a necessary and sufficient condition for optimality of beamforming. In other words we wish to determine a necessary and sufficient condition under which the input covariance matrix $Q^o$ has unit rank, i.e. the channel is effectively a SISO channel. This is significant since there are known scalar codes that achieve near capacity rates on SISO channels (e.g. turbo codes). No such practical codes have been designed for vector channels under partial channel state information at the transmitter.
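The step from the Ricean model (6) to the normalized model (8) is stated without detail. One way to read it, offered here as a sketch rather than as the authors' own derivation, is to absorb the common factor $1/\sqrt{K+1}$ into the transmitted signal. The symbols $\tilde{x}$ and the identification $H_\mu = \sqrt{K}\,H_{sp}$ are assumptions introduced for this illustration.

```latex
\begin{align*}
r &= \Big(\sqrt{\tfrac{K}{K+1}}\,H_{sp} + \sqrt{\tfrac{1}{K+1}}\,H_w\Big)\,x + n \\
  &= \big(\sqrt{K}\,H_{sp} + H_w\big)\,\frac{x}{\sqrt{K+1}} + n \\
  &= \big(H_\mu + H_w\big)\,\tilde{x} + n,
  \qquad H_\mu \triangleq \sqrt{K}\,H_{sp},
  \quad \tilde{x} \triangleq \frac{x}{\sqrt{K+1}} .
\end{align*}
```

Under this reading the rescaled input $\tilde{x}$ carries power $P/(K+1)$, and $H_\mu$ inherits unit rank from the outer-product form of $H_{sp}$ in (7).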
Note that the channel distribution parameters $(K_t, H_\mu)$ may change over time. However, since the channel distribution changes much more slowly than the channel itself, we assume local stationarity and are interested in the capacity $C$ for each given, fixed channel distribution. In general one can define a stochastic model for the parameters themselves and adapt the transmit power $P$ to the channel distribution. Once the optimum power adaptation is determined, this would still require solving the problem stated earlier for each $P(H_\mu)$ or $P(K_t)$. Moreover, numerical results in [1] show that adapting the transmit power to the feedback results in little performance gain as compared to keeping the transmit power constant.

III. Scalar Coding vs Vector Coding
A description of scalar and vector coding strategies and the associated complexity tradeoffs is given in [4]. However, we include this discussion here to keep this paper as self-contained as possible and to highlight certain aspects besides those mentioned in [4].

Vector coding refers to fully unconstrained signaling schemes for the memoryless vector-input vector-output power limited Gaussian channel. Every symbol period, a channel use corresponds to the transmission of a vector symbol comprised of the inputs to each transmit antenna. A vector codeword is an ordered sequence of these vector symbols. Mathematically, a vector codeword spanning $N$ channel uses can be represented as an $n_T \times N$ matrix with the rows and columns describing the progression of space and time respectively. Ideally, while decoding vector codewords the receiver needs to take into account the dependencies in both space and time dimensions, and therefore the complexity of vector decoding grows exponentially in the number of transmit antennas. While optimal decoding is seldom used in practical systems, even suboptimal vector decoding strategies that do not lose significantly on the capacity are not known in general.

Note that the vector coding strategy can also correspond to several scalar codewords being transmitted in parallel. In fact any input covariance matrix, regardless of its rank, can be treated as several scalar codewords encoded independently at the transmitter and decoded successively at the receiver by subtracting out the contribution from previously decoded codewords at each stage. This is seen as follows. Suppose the capacity achieving input covariance matrix for (2) is given by the spectral decomposition $Q^o = U^o \Lambda^o U^{o\dagger}$.
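As a numerical sanity check of the decomposition that the following derivation establishes, the sketch below evaluates the per-codeword rate terms for one fixed channel realization and compares their sum with the log-determinant. It is an illustration only: it assumes $n_R = n_T = 2$, drops the expectation over $H$, uses natural logarithms, and all helper names and numerical values are hypothetical.

```python
import math

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def outer(v):
    """Rank-one 2x2 matrix v v^H."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]


def add_scaled(a, b, scale):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def quad_form(m, v):
    """v^H m v, which is real when m is Hermitian."""
    mv = mat_vec(m, v)
    return (v[0].conjugate() * mv[0] + v[1].conjugate() * mv[1]).real


def successive_rates(h, beams, powers, noise_var):
    """Per-codeword terms log(1 + lambda_j h_j) and log det(I + H Q H^H / noise_var)."""
    rates = []
    interference = IDENTITY  # I + sum over codewords 1..j-1 of lambda_i H u_i u_i^H H^H / noise_var
    for u, lam in zip(beams, powers):
        g = mat_vec(h, u)  # H u_j
        gain = quad_form(inv2(interference), g) / noise_var  # scalar channel h_j
        rates.append(math.log(1.0 + lam * gain))
        interference = add_scaled(interference, outer(g), lam / noise_var)
    total = math.log(det2(interference).real)
    return rates, total


theta = 0.3  # arbitrary rotation angle defining an orthonormal pair of beams
beams = [[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]]
h = [[1 + 1j, 0.5j], [0.2 + 0j, 1 - 0.5j]]  # one arbitrary channel realization
rates, total = successive_rates(h, beams, [3.0, 1.0], 1.0)
```

For any choice of channel, orthonormal beams and non-negative powers, `sum(rates)` and `total` should agree to within floating-point error, which is the per-realization form of the identity behind the sum of rates.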
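Then the capacity can be expressed as

\begin{align}
C &= E\left[\log\left|I_{n_R} + \frac{HQ^oH^\dagger}{\sigma^2}\right|\right] \tag{9}\\
&= E\left[\log\left|I_{n_R} + \sum_{i=1}^{n_T}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right|\right] \tag{10}\\
&= \sum_{j=1}^{n_T}\left\{E\left[\log\left|I_{n_R} + \sum_{i=1}^{j}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right|\right] - E\left[\log\left|I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right|\right]\right\} \tag{11}\\
&= \sum_{j=1}^{n_T} E\left[\log\left|I_{n_R} + \left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1}\frac{\lambda_j^o HU_j^oU_j^{o\dagger}H^\dagger}{\sigma^2}\right|\right] \tag{12}\\
&= \sum_{j=1}^{n_T} E\left[\log\left(1 + \frac{\lambda_j^o}{\sigma^2}\,U_j^{o\dagger}H^\dagger\left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1}HU_j^o\right)\right] \tag{13}\\
&= \sum_{j=1}^{n_T} R_j, \tag{14}
\end{align}

where (13) follows from (12) using the property $\det(I + AB) = \det(I + BA)$. Note that

$$R_j = E\left[\log\left(1 + \frac{\lambda_j^o}{\sigma^2}\,U_j^{o\dagger}H^\dagger\left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1}HU_j^o\right)\right] \tag{15}$$

is the rate carried by the $j$th scalar codeword and the codewords are decoded at the receiver in descending order, i.e. codeword 1 is decoded last and codeword $n_T$ is decoded first. Here $U_j^o$ is the beamforming vector that maps each scalar symbol of the $j$th codeword onto the transmit antenna array, $\lambda_j^o$ are the eigenvalues of $Q^o$ which determine the relative power distribution among the scalar codewords, and $\left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1/2}$ is the whitening filter used at the receiver to whiten the colored interference seen by the $j$th scalar codeword from the codewords that have not been decoded yet, i.e. codewords 1 through $j-1$, treated as additive Gaussian noise. Thus the effective channel seen by the $j$th scalar codeword is given by the single input multiple output channel

$$H_j^e = \left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1/2} HU_j^o. \tag{16}$$

Finally, matched filtering at the receiver with $H_j^{e\dagger}$ yields the scalar channel seen by the $j$th codeword as

$$h_j = \frac{1}{\sigma^2}\,U_j^{o\dagger}H^\dagger\left(I_{n_R} + \sum_{i=1}^{j-1}\frac{\lambda_i^o HU_i^oU_i^{o\dagger}H^\dagger}{\sigma^2}\right)^{-1}HU_j^o, \tag{17}$$

and the rate supported by the $j$th codeword is given by

$$R_j = E[\log(1 + \lambda_j^o h_j)] \tag{18}$$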
which is the same expression as (15). Hence, we see that any input covariance matrix corresponding to a vector coding strategy can also be realized with several independent scalar codewords transmitted simultaneously and decoded successively. However, well known problems associated with successive decoding and interference subtraction, e.g. error propagation, render this approach unsuitable for use in practical systems. In this paper we use scalar coding to describe a single beamforming strategy, i.e. when the input covariance matrix has unit rank and the multiple input multiple output channel can be transformed into a single input single output channel as described above. Thus the well established scalar code technology can be used to approach capacity and, since there is only one beam, interference cancellation is not needed.

IV. Previous Work
The solutions to the transmitter optimization problem and the previous results on optimality of beamforming presented in [3] and [4] are summarized here. Note that these results assume a single receive antenna. To make that distinction clear we represent the channel as the vector $h$ instead of the matrix $H$ that we consider for our channel models.

A. Transmitter Optimization - Covariance Feedback

For covariance feedback $h \sim \tilde{N}(0, \Sigma)$. Then, as shown in [3], the maximizing covariance matrix $Q^o$ and the channel covariance matrix $\Sigma$ have the same eigenvectors. So the spectral decompositions can be expressed as $Q^o = U_\Sigma \Lambda_Q^o U_\Sigma^\dagger$ and $\Sigma = U_\Sigma \Lambda_\Sigma U_\Sigma^\dagger$. The optimal strategy is to employ independent complex circular Gaussian inputs along the eigenvectors of $\Sigma$. $\Lambda_Q^o$ is a diagonal matrix whose elements $\lambda_i^Q$ need to be determined through numerical maximization techniques subject to the trace constraint.

B. Transmitter Optimization - Mean Feedback

For mean feedback $h \sim \tilde{N}(\mu, \alpha I)$. Then, as shown in [3], the maximizing covariance matrix $Q^o$ has a spectral decomposition $Q^o = U^o \Lambda^o U^{o\dagger}$, where the first column of the unitary matrix $U^o$ is given by $U^o[1] = \mu / \|\mu\|$, and $U^o[2], \dots, U^o[n_T]$ are arbitrarily chosen, except for the restriction that the columns of $U^o$ are orthonormal. Furthermore, the eigenvalues $\lambda_2^o = \cdots = \lambda_{\min}^o \triangleq \lambda^o$, where $\lambda^o = \frac{P - \lambda_1^o}{n_T - 1}$.
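The structure just described (one eigenvector along the channel mean, equal power on all remaining directions) can be written compactly as $Q = \lambda_1 uu^\dagger + \lambda^o (I - uu^\dagger)$ with $u = \mu/\|\mu\|$. The sketch below builds this matrix for a given split of the power budget. It is a minimal illustration under assumptions: $\mu$ is treated as a column vector, the value of $\lambda_1$ is supplied by the caller (the numerical optimization that would choose it is not shown), and the function name and example values are hypothetical.

```python
import math


def mean_feedback_covariance(mu, total_power, lambda_1):
    """Build Q = lambda_1 u u^H + lambda_o (I - u u^H) with u = mu / ||mu||.

    lambda_o = (total_power - lambda_1) / (n_T - 1) is shared by the remaining
    n_T - 1 eigen-directions, so trace(Q) = total_power.
    """
    n_t = len(mu)
    norm = math.sqrt(sum(abs(m) ** 2 for m in mu))
    u = [m / norm for m in mu]
    lambda_o = (total_power - lambda_1) / (n_t - 1)
    q = [[(lambda_1 - lambda_o) * u[i] * u[j].conjugate()
          + (lambda_o if i == j else 0.0)
          for j in range(n_t)] for i in range(n_t)]
    return q, lambda_o


mu = [1 + 1j, 2 + 0j, -1j]  # arbitrary example mean vector
q, lambda_o = mean_feedback_covariance(mu, total_power=3.0, lambda_1=2.0)
trace_q = sum(q[i][i] for i in range(3)).real
q_mu = [sum(q[i][j] * mu[j] for j in range(3)) for i in range(3)]  # equals lambda_1 * mu
```

Setting `lambda_1` equal to `total_power` gives `lambda_o = 0`, i.e. a rank-one matrix, which is the beamforming case.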
C. Optimality of Beamforming and Quality of Feedback

With perfect feedback, i.e. $h \sim \tilde{N}(\mu, 0)$, the capacity achieving input covariance matrix $Q^o = P\,\frac{\mu\mu^\dagger}{\mu^\dagger\mu}$ has rank one and therefore beamforming achieves capacity [4]. With no feedback, i.e. $h \sim \tilde{N}(0, \alpha I)$, the optimum covariance matrix derived by Telatar in [2], $Q^o = \frac{P}{n_T} I$, has full rank. So beamforming is not optimal. For the special case $h \sim \tilde{N}(0, \Sigma)$, it is shown in [4] that beamforming in the direction corresponding to the largest eigenvalue of the channel covariance matrix $\Sigma$ is asymptotically optimum as the SNR tends to zero. Numerical results in [3] indicate that beamforming becomes the optimal strategy as the quality of feedback improves under mean feedback or if there is a strong channel mode present under covariance feedback.

V. Transmitter Optimization Under Covariance Feedback
The following theorem characterizes the optimal transmit strategy under covariance feedback.

Theorem 1: For the covariance feedback model described in Section II-A, i.e. when $H = H_w \Sigma^{1/2}$, the input covariance matrix $Q^o$ that maximizes the capacity expression in (2) has the same eigenvectors as the channel covariance matrix $\Sigma$. (The corresponding eigenvalues can be determined through numerical optimization techniques subject to the trace constraint. See [11] for a detailed treatment of this numerical optimization problem.)

To prove Theorem 1, let the eigendecomposition of $\Sigma$ be given as $\Sigma = U_\Sigma \Lambda_\Sigma U_\Sigma^\dagger$ where $U_\Sigma$ is a unitary matrix and $\Lambda_\Sigma$ is a diagonal matrix containing the eigenvalues of $\Sigma$ arranged in decreasing order. We assume that $\Sigma$ has full rank so that $\lambda_{11} \ge \lambda_{22} \ge \cdots \ge \lambda_{n_T n_T} > 0$. Our goal is to show that the optimal input covariance matrix has a spectral decomposition $Q = U_\Sigma \Lambda_Q U_\Sigma^\dagger$. Equivalently we wish to show that for the optimal $Q$ the matrix $U_\Sigma^\dagger Q U_\Sigma$ is diagonal. Note that the optimum $Q$ may not have full rank, as is typical of a waterfilling solution when there is insufficient water and therefore some modes are left dry. Define the matrix

$$Z \triangleq H U_\Sigma \Lambda_\Sigma^{-1/2}. \tag{19}$$
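The whitening property used in the next step can be verified directly from the definition in (19). The short computation below is a sketch, writing $\Sigma = U_\Sigma \Lambda_\Sigma U_\Sigma^\dagger$ for the eigendecomposition of the transmit covariance (the subscripted names are a notational assumption) and using $E[H_i^\dagger H_i] = \Sigma$ for each row $H_i$ of $H$, which follows from the model $H = H_w \Sigma^{1/2}$.

```latex
\begin{align*}
E\big[Z_i^\dagger Z_i\big]
  &= \Lambda_\Sigma^{-1/2}\, U_\Sigma^\dagger\, E\big[H_i^\dagger H_i\big]\, U_\Sigma\, \Lambda_\Sigma^{-1/2} \\
  &= \Lambda_\Sigma^{-1/2}\, U_\Sigma^\dagger\, \Sigma\, U_\Sigma\, \Lambda_\Sigma^{-1/2} \\
  &= \Lambda_\Sigma^{-1/2}\, \Lambda_\Sigma\, \Lambda_\Sigma^{-1/2} \;=\; I_{n_T}.
\end{align*}
```

Since the entries of each row are jointly Gaussian with zero mean, an identity covariance means they are independent with unit variance.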
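So the $i$th row of $Z$ is given as $Z_i = H_i U_\Sigma \Lambda_\Sigma^{-1/2}$. Since the rows of $H$ are zero mean and i.i.d., so are the rows of $Z$. Also, since $U_\Sigma \Lambda_\Sigma^{-1/2}$ is the whitening filter for the random vector $H_i$, the covariance matrix of $Z_i$ is given by the identity matrix. Thus $Z$ consists of i.i.d. zero mean and unit variance complex Gaussian elements. Substituting back $H = Z \Lambda_\Sigma^{1/2} U_\Sigma^\dagger$ in the capacity expression (2) we get

$$C = \max_{Q} E\left[\log\left|I_{n_R} + \frac{Z \Lambda_\Sigma^{1/2} U_\Sigma^\dagger Q U_\Sigma \Lambda_\Sigma^{1/2} Z^\dagger}{\sigma^2}\right|\right] \quad \text{s.t. } \mathrm{trace}(Q) = P. \tag{20}$$

We define $\hat{Q} \triangleq \Lambda_\Sigma^{1/2} U_\Sigma^\dagger Q U_\Sigma \Lambda_\Sigma^{1/2}$. Note that since $U_\Sigma$ is unitary the non-negative definite matrices $Q$ and $U_\Sigma^\dagger Q U_\Sigma = \Lambda_\Sigma^{-1/2} \hat{Q} \Lambda_\Sigma^{-1/2}$ have the same set of eigenvalues. Therefore we have

$$\mathrm{trace}(Q) = \mathrm{trace}(U_\Sigma^\dagger Q U_\Sigma) = \mathrm{trace}(\Lambda_\Sigma^{-1/2} \hat{Q} \Lambda_\Sigma^{-1/2}), \tag{21}$$

and the capacity expression (20) can be rewritten as

$$C = \max_{\hat{Q}:\ \mathrm{trace}(\Lambda_\Sigma^{-1/2} \hat{Q} \Lambda_\Sigma^{-1/2}) = P} E\left[\log\left|I_{n_R} + \frac{Z \hat{Q} Z^\dagger}{\sigma^2}\right|\right], \tag{22}$$

where the rows of $Z$ are independent and identically distributed as $Z_i \sim \tilde{N}(0, I)$. As stated earlier, our aim is to show that this optimal $U_\Sigma^\dagger Q U_\Sigma$ is a diagonal matrix. Now, since $\Lambda_\Sigma^{1/2}$ is a diagonal matrix, this is equivalent to showing that the optimal $\hat{Q} = \Lambda_\Sigma^{1/2} U_\Sigma^\dagger Q U_\Sigma \Lambda_\Sigma^{1/2}$ is a diagonal matrix. Next we prove that the optimal $\hat{Q}$ that maximizes the capacity in (22) is indeed diagonal.

Let the optimal $\hat{Q}$ have a spectral decomposition $\hat{Q} = \hat{U} \hat{\Lambda} \hat{U}^\dagger$ where as before $\hat{U}$ is a unitary matrix and $\hat{\Lambda}$ is a diagonal matrix of eigenvalues arranged in decreasing order. Since each element of $Z$ is i.i.d. and zero mean and $\hat{U}$ is unitary it is easy to see that $Z\hat{U} \sim Z$, i.e. $Z\hat{U}$ and $Z$ are identically distributed. This implies that

$$E\left[\log\left|I_{n_R} + \frac{Z \hat{Q} Z^\dagger}{\sigma^2}\right|\right] = E\left[\log\left|I_{n_R} + \frac{Z \hat{\Lambda} Z^\dagger}{\sigma^2}\right|\right].$$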
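The invariance above can be illustrated with a small Monte Carlo experiment. The sketch below is not from the paper and makes simplifying assumptions: $n_R = 1$ (so the determinant reduces to a scalar), $n_T = 2$, a real rotation playing the role of $\hat{U}$, natural logarithms, and hypothetical function names and parameter values. It estimates the expected log term once for a rotated covariance and once for the diagonal matrix of its eigenvalues, using independent samples.

```python
import math
import random


def complex_gaussian(rng):
    """One zero-mean, unit-variance circularly symmetric complex Gaussian sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def rotated_covariance(eigs, theta):
    """Return U diag(eigs) U^T for a 2x2 rotation U by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    u = [[c, -s], [s, c]]
    return [[sum(u[i][k] * eigs[k] * u[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def average_rate(q_hat, noise_var, n_samples, rng):
    """Monte Carlo estimate of E[log(1 + z q_hat z^H / noise_var)] for a 1x2 row z."""
    total = 0.0
    for _ in range(n_samples):
        z = [complex_gaussian(rng), complex_gaussian(rng)]
        quad = sum(z[i] * q_hat[i][j] * z[j].conjugate()
                   for i in range(2) for j in range(2)).real
        total += math.log(1.0 + quad / noise_var)
    return total / n_samples


eigs = [2.0, 0.5]  # arbitrary example eigenvalues
rate_rotated = average_rate(rotated_covariance(eigs, 0.7), 1.0, 40000, random.Random(1))
rate_diagonal = average_rate(rotated_covariance(eigs, 0.0), 1.0, 40000, random.Random(2))
```

The two estimates should differ only by Monte Carlo error, since rotating a white Gaussian row leaves its distribution unchanged.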
So the diagonal matrix $\hat{\Lambda}$ achieves the same capacity as the optimal $\hat{Q}$. However note that $\hat{Q}$ needs to satisfy the additional constraint given by

$$\mathrm{trace}(\Lambda_\Sigma^{-1/2} \hat{Q} \Lambda_\Sigma^{-1/2}) = P. \tag{23}$$
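To complete the proof we need to show that if $\hat{Q}$ satisfies the trace constraint, then so does $\hat{\Lambda}$. Now

$$\mathrm{trace}(\Lambda_\Sigma^{-1/2} \hat{Q} \Lambda_\Sigma^{-1/2}) = \sum_{i=1}^{n_T} \frac{\hat{q}_{ii}}{\lambda_{ii}}, \tag{24}$$

and

$$\mathrm{trace}(\Lambda_\Sigma^{-1/2} \hat{\Lambda} \Lambda_\Sigma^{-1/2}) = \sum_{i=1}^{n_T} \frac{\hat{\lambda}_{ii}}{\lambda_{ii}}, \tag{25}$$

where $\hat{q}_{ii}$, $\hat{\lambda}_{ii}$ and $\lambda_{ii}$ are the diagonal elements of $\hat{Q}$, $\hat{\Lambda}$ and $\Lambda_\Sigma$ respectively. We need the following lemmas in order statistics:

Lemma 1: For a Hermitian matrix $A$ the vector of diagonal entries $\{a_{ii}\}$ majorizes the vector of eigenvalues $\{\lambda^A_{ii}\}$.

Proof: This is Theorem 4.3.26 in [12].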
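Lemma 1 can be checked on a small example using the definition of majorization recalled in the paragraph that follows (partial sums of the smallest entries, with equal totals). The sketch below is an illustration only: the function names are hypothetical, the matrix is an arbitrary 2x2 Hermitian example, and its eigenvalues are obtained from the closed-form expression for the 2x2 case.

```python
import math


def majorizes(alpha, beta, tol=1e-9):
    """True if alpha majorizes beta in the convention used in the text:
    for k = 1..n-1 the sum of the k smallest entries of alpha is at least
    that of beta, and the two vectors have equal sums."""
    if len(alpha) != len(beta):
        return False
    a, b = sorted(alpha), sorted(beta)
    if abs(sum(a) - sum(b)) > tol:
        return False
    partial_a = partial_b = 0.0
    for k in range(len(a) - 1):
        partial_a += a[k]
        partial_b += b[k]
        if partial_a < partial_b - tol:
            return False
    return True


def eigenvalues_hermitian_2x2(a11, a22, a12):
    """Eigenvalues of [[a11, a12], [conj(a12), a22]] with real a11 and a22."""
    mean = (a11 + a22) / 2.0
    radius = math.sqrt(((a11 - a22) / 2.0) ** 2 + abs(a12) ** 2)
    return [mean - radius, mean + radius]


diagonal = [2.0, 3.0]
eigenvalues = eigenvalues_hermitian_2x2(2.0, 3.0, 1 + 1j)  # arbitrary example matrix
diagonal_majorizes_eigenvalues = majorizes(diagonal, eigenvalues)
eigenvalues_majorize_diagonal = majorizes(eigenvalues, diagonal)
```

In this convention the diagonal is the "more nearly equal" vector, so the first check is expected to hold while the reverse check fails whenever the off-diagonal entry is non-zero.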
Recall that a real vector $\alpha = [\alpha_i] \in \mathbb{R}^n$ majorizes another real vector $\beta = [\beta_i] \in \mathbb{R}^n$ if and only if the sum of the $k$ smallest entries of $\alpha$ is greater than or equal to the sum of the $k$ smallest entries of $\beta$ for $k = 1, 2, \dots, n-1$ and the sums of the entries of $\alpha$ and $\beta$ are equal. This is a mathematical way to capture the vague notion that the components of a vector $\alpha$ are "less spread out" or "more nearly equal" than are the components of a vector $\beta$. Majorization is the precise relationship between the eigenvalues and the diagonal entries of a Hermitian matrix. That is, for any real vector $\alpha$ that majorizes another real vector $\beta$ there exists a Hermitian matrix with its main diagonal given by $\alpha$ and its eigenvalues given by $\beta$.

Lemma 2: For any two given positive real vectors $\alpha = [\alpha_i], \beta = [\beta_i] \in \mathbb{R}^n_+$ the permutation $\pi$ that minimizes the sum $\sum_{i=1}^{n} \frac{\alpha_{\pi(i)}}{\beta_i}$ is such that $\alpha_{\pi(i)}$ and $\beta_i$ are in the same order. That is, $\forall i, j \in \{1, 2, \dots, n\}$, if $\alpha_{\pi(i)}$