
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 12, DECEMBER 2008


Weighted Sum-Rate Maximization using Weighted MMSE for MIMO-BC Beamforming Design

Søren Skovgaard Christensen, Rajiv Agarwal, Elisabeth de Carvalho, and John M. Cioffi

Abstract—This paper studies linear transmit filter design for Weighted Sum-Rate (WSR) maximization in the Multiple Input Multiple Output Broadcast Channel (MIMO-BC). The problem of finding the optimal transmit filter is non-convex and intractable to solve using low complexity methods. Motivated by recent results highlighting the relationship between mutual information and Minimum Mean Square Error (MMSE), this paper establishes a relationship between weighted sum-rate and weighted MMSE in the MIMO-BC. The relationship is used to propose two low complexity algorithms for finding a local weighted sum-rate optimum based on alternating optimization. Numerical results studying sum-rate show that the proposed algorithms achieve high performance with few iterations.

Index Terms—MIMO systems, transceiver design, smart antennas, antennas and propagation.

Manuscript received August 1, 2007; revised March 16, 2008 and October 27, 2008; accepted November 4, 2008. The associate editor coordinating the review of this paper and approving it for publication was K. Wong. S. S. Christensen is with Nokia Denmark, Modem Algorithm Design, Copenhagen, Denmark (e-mail: [email protected]). R. Agarwal and J. M. Cioffi are with Stanford University (e-mail: {rajivag, cioffi}@stanford.edu). E. de Carvalho is with the Department of Electronic Systems, Aalborg University, Aalborg, Denmark (e-mail: [email protected]). Digital Object Identifier 10.1109/T-WC.2008.070851

I. INTRODUCTION

MIMO systems have great potential to achieve high throughput in wireless systems [1]. In cellular systems, multiple antennas can easily be deployed at the base station to enhance the system capacity. When Channel State Information (CSI) is available at the transmitter, the base station can transmit to multiple users simultaneously to achieve a linear increase of system throughput in the number of transmit antennas. This can be done using linear or non-linear transmission techniques. For the Multiple Input Multiple Output Broadcast Channel (MIMO-BC), non-linear techniques have been shown to outperform linear techniques and achieve channel capacity. The capacity-achieving downlink strategy is non-linear and uses Dirty Paper Coding (DPC) [2]. However, practical techniques to implement DPC [3], [4], [5] are in preliminary stages of development and are difficult to implement in practice because of their high computational burden. This makes linear downlink transmission techniques (also called beamforming) an attractive alternative because of their simplicity [6], [7], [8], [9]. Transmit beamforming design entails finding the linear transmit filter through which the data intended for the different users is passed before transmission on the channel.

This paper focuses on transmit beamforming design to maximize Weighted Sum-Rate (WSR) subject to a transmit-power constraint, which is a non-convex and non-trivial problem. WSR is useful for prioritizing different users and thus finds different practical applications. For instance, the weights can be chosen according to the state of the packet queues corresponding to a max-stability service [10] or, by using equal weights, to maximize sum-rate corresponding to a best effort service.

A recent paper [11] studies the same problem and proposes an iterative algorithm based on uplink-downlink Mean Square Error (MSE) duality. From a given starting point, the algorithm converges to a local WSR-optimum. The principle in the algorithm is to iterate between the downlink system and a virtual uplink system in order to update filter structures, in addition to solving a Geometric Program (GP) for optimizing the transmit power distribution. In another recent paper [12], the authors attempt to solve the WSR problem using concepts from [8]; however, their algorithm is a 4-step iterative algorithm, two steps of which require solving, respectively, a GP (which is itself solved iteratively) and a Second-Order Cone Program (SOCP).

This paper takes a different approach to solving the WSR problem, which leads to an iterative algorithm that is guaranteed to converge to a local WSR-optimum. Along the same lines as recent results highlighting the relationship between information theoretic quantities (mutual information) and MMSE in single user Multiple Input Multiple Output (MIMO) channels [13], [14], we have established a relationship between WSR and Weighted sum-Minimum Mean Square Error (WMMSE) in the MIMO-BC. By comparing the gradients of the WSR and WMMSE cost functions, we are able to show a simple relationship between the Karush-Kuhn-Tucker (KKT) conditions of the two problems. Essentially, we show that the WSR problem can be solved as a WMMSE problem with optimized MSE-weights.

Using the derived correspondence we propose an iterative algorithm for WSR-optimization in the MIMO-BC. The algorithm iterates between WMMSE transmit filter design, MMSE receive filter computation using well-known closed-form expressions, and weighting matrix update. Each of the three steps is solved by evaluating closed-form expressions, and the proposed algorithm is less complex than state-of-the-art methods [11], [12], which require multiple levels of iterations. Two versions of the algorithm are given. In the first one, the weight matrix is computed based on the correspondence between WSR and WMMSE. In the second one, the weight matrix is additionally constrained to be diagonal, which is shown to lead to a WSR-optimum with decorrelated streams at each user. Numerical results comparing convergence rates and sum-rate performance to other recently proposed algorithms are presented.

Notation: $m_{ij}$ denotes the $(i,j)$th entry of the matrix $\mathbf{M}$. $\mathbf{M}^T$, $\mathbf{M}^H$, and $\mathrm{Tr}(\mathbf{M})$ denote the transpose, conjugate transpose, and trace of a vector/matrix $\mathbf{M}$. The dimension of a matrix $\mathbf{M}$ is denoted
by the subscript $\mathbf{M}_{[Q \times P]}$, where $Q$ is the row dimension. $\mathbf{I}_K$ denotes an identity matrix of size $K \times K$. $\|\mathbf{v}\|$ denotes the Euclidean norm of a vector $\mathbf{v}$. $E[\cdot]$ denotes statistical expectation.

© 2008 IEEE 1536-1276/08$25.00

II. SYSTEM MODEL AND MAIN OBJECTIVE

A general narrowband point-to-multipoint MIMO system is considered. There are $P$ transmit antennas and $K$ users, each with $Q$ receive antennas. The system has in total $QK$ receive antennas across all users¹. The MIMO channel between the transmitter and user $k$ is described by a matrix $\mathbf{H}_k \in \mathbb{C}^{[Q \times P]}$ containing complex-valued channel gains of the different antenna-pairs. The signal observed at user $k$ at sample time $n$ can be represented by the complex vector

$$\mathbf{y}_k(n) = \mathbf{H}_k \mathbf{x}(n) + \mathbf{v}_k(n), \tag{1}$$

where $\mathbf{x}(n) \in \mathbb{C}^{[P \times 1]}$ is the complex-valued transmitted vector, and $\mathbf{v}_k(n) \in \mathbb{C}^{[Q \times 1]}$ is a noise vector containing circularly symmetric white² Gaussian noise with covariance $\mathbf{R}_{v_k v_k} = E\left[\mathbf{v}_k(n)\mathbf{v}_k^H(n)\right] = \mathbf{I}_Q$. The transmit vector $\mathbf{x}(n)$ is a linearly filtered version of the input data vectors $\mathbf{d}_1(n), \dots, \mathbf{d}_K(n) \in \mathbb{C}^{[Q \times 1]}$:

$$\mathbf{x}(n) = \sum_{k=1}^{K} \mathbf{B}_k \mathbf{d}_k(n). \tag{2}$$

The matrices $\mathbf{B}_1, \dots, \mathbf{B}_K \in \mathbb{C}^{[P \times Q]}$ are the transmit filters (beamformers). It is assumed that each user has $Q$ parallel data streams, although some of the streams can have a rate of zero. Additionally, it is assumed that each user receives $Q$ independent streams such that $E\left[\mathbf{d}_k(n)\mathbf{d}_k^H(n)\right] = \mathbf{I}_Q$. The transmit vectors respect a total block power constraint for a block consisting of $N$ transmissions, i.e.

$$\frac{1}{N}\sum_{n=1}^{N} \mathbf{x}^H(n)\mathbf{x}(n) \le E_{tx}. \tag{3}$$

Throughout the analysis it is assumed that $N$ is large such that $\frac{1}{N}\sum_{n=1}^{N} \mathbf{x}^H(n)\mathbf{x}(n)$ can be replaced by $E\left[\mathbf{x}^H(n)\mathbf{x}(n)\right] = \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right)$. It is also assumed that the channel changes in a quasi-static manner, and hence the channel matrices $\mathbf{H}_1, \dots, \mathbf{H}_K$ are constant for the duration of the block. Furthermore, it is assumed that CSI, i.e. $\mathbf{H}_1, \dots, \mathbf{H}_K$, is perfectly known at the transmitter.
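The model (1)-(2) and the large-$N$ replacement of the block power (3) by $\sum_k \mathrm{Tr}(\mathbf{B}_k\mathbf{B}_k^H)$ can be illustrated with a small simulation. This is only a sketch under assumed values: the helper names `crandn` and `simulate_block`, the sizes, the random filters, and the block length are illustrative choices, not taken from the paper.

```python
import numpy as np

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_block(H, B, n_samples, rng):
    """Draw n_samples transmissions of the model (1)-(2)."""
    K, Q, P = H.shape
    d = crandn(rng, K, Q, n_samples)            # data streams, E[d d^H] = I_Q
    x = np.einsum('kpq,kqn->pn', B, d)          # x(n) = sum_k B_k d_k(n)
    v = crandn(rng, K, Q, n_samples)            # white noise, covariance I_Q
    y = np.einsum('kqp,pn->kqn', H, x) + v      # y_k(n) = H_k x(n) + v_k(n)
    return x, y

rng = np.random.default_rng(0)
P, K, Q, E_tx = 4, 2, 2, 10.0                   # assumed example sizes
H = crandn(rng, K, Q, P)                        # H[k] is the Q x P channel of user k
B = crandn(rng, K, P, Q)                        # arbitrary (non-optimized) filters
B *= np.sqrt(E_tx / np.sum(np.abs(B) ** 2))     # enforce sum_k Tr(B_k B_k^H) = E_tx

x, y = simulate_block(H, B, 20000, rng)
block_power = np.mean(np.sum(np.abs(x) ** 2, axis=0))   # (1/N) sum_n x^H(n) x(n)
expected_power = np.sum(np.abs(B) ** 2)                  # sum_k Tr(B_k B_k^H)
```

For a long block, `block_power` is close to `expected_power`, which is the approximation the analysis relies on.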

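A. Main objective

The main objective is to find the transmit filters $\mathbf{B}_1, \dots, \mathbf{B}_K$ which maximize the weighted sum-rate. This can be written as the minimization problem:

$$[\mathbf{B}_1^{WSR}, \dots, \mathbf{B}_K^{WSR}] = \arg\min_{\mathbf{B}_1, \dots, \mathbf{B}_K} \sum_{k=1}^{K} -u_k R_k \quad \text{s.t.} \quad \sum_{k=1}^{K} \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) = E_{tx}, \tag{4}$$

where $u_k \ge 0$ and $R_k$ denote, respectively, the weight and the rate of the $k$th user. Without loss of generality we have used an equality power constraint rather than the often used $\sum_{k=1}^{K} \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) \le E_{tx}$, because the WSR optimum is reached at maximum transmit power. Assuming Gaussian signaling, the achievable rate for user $k$ is given as

$$R_k = \log\det\left(\mathbf{I}_k + \mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\right), \tag{5}$$

where $\mathbf{R}_{\tilde{v}_k\tilde{v}_k}$ denotes the effective noise covariance matrix at user $k$:

$$\mathbf{R}_{\tilde{v}_k\tilde{v}_k} = \mathbf{I}_k + \sum_{i=1, i\neq k}^{K} \mathbf{H}_k\mathbf{B}_i\mathbf{B}_i^H\mathbf{H}_k^H. \tag{6}$$

We notice that $R_k$ can be expressed as a function of the error covariance matrix after MMSE receive filtering. The MMSE receive filter at user $k$ is given as:

$$\mathbf{A}_k^{MMSE} = \arg\min_{\mathbf{A}_k} E\left[\|\mathbf{A}_k\mathbf{y}_k - \mathbf{d}_k\|^2\right] = \mathbf{B}_k^H\mathbf{H}_k^H\left(\mathbf{H}_k\mathbf{B}_k\mathbf{B}_k^H\mathbf{H}_k^H + \mathbf{R}_{\tilde{v}_k\tilde{v}_k}\right)^{-1}. \tag{7}$$

The MSE-matrix for user $k$, given that the MMSE receive filter is applied, can be written as [15]:

$$\mathbf{E}_k = E\left[\left(\mathbf{A}_k^{MMSE}\mathbf{y}_k - \mathbf{d}_k\right)\left(\mathbf{A}_k^{MMSE}\mathbf{y}_k - \mathbf{d}_k\right)^H\right] = \left(\mathbf{I}_k + \mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\right)^{-1}. \tag{8}$$

We refer to $\mathbf{E}_k$ as the MMSE-matrix. Given (5) and (8), the rate for user $k$ can be written as:

$$R_k = \log\det\left(\mathbf{E}_k^{-1}\right). \tag{9}$$

¹ Without loss of generality each user is assumed to have $Q$ receive antennas. Users with fewer antennas can be emulated by nulling the corresponding channel gains.

² The noise covariance matrix $\mathbf{R}_{vv}$ can be assumed to be white without loss of generality after an appropriate whitening transform on the channel matrix.

III. GRADIENT EXPRESSIONS AND KKT CONDITIONS

This section first studies the gradient for the WSR maximization problem. Next, the gradient for a new optimization problem, WMMSE, is computed and we are able to show that there is a simple relationship between the KKT conditions of the two problems.

A. Gradient of Weighted Sum-Rate Maximization

To investigate stationary points of the problem (4) we formulate the Lagrangian:

$$f(\mathbf{B}_1, \dots, \mathbf{B}_K) = \sum_k -u_k R_k + \lambda\left(\sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) - E_{tx}\right). \tag{12}$$

We define $\nabla_{\mathbf{B}_k} f = \frac{\partial f}{\partial \mathbf{B}_k^*}$ as the complex gradient operator. The gradient is a matrix with the $[n,m]$th element defined as $[\nabla_{\mathbf{B}_k} f]_{nm} = \nabla_{[\mathbf{B}_k]_{nm}} f = \frac{\partial f}{\partial [\mathbf{B}_k^*]_{nm}}$. From the KKT conditions a local optimum must satisfy, for all $k$, $\nabla_{\mathbf{B}_k} f = 0$ and $\nabla_{\lambda} f = 0$.

Next, the gradients of $R_k$ and $R_i$, $i \neq k$, w.r.t. the transmit filter $\mathbf{B}_k$ are computed. The computations are partly based on the result [16]: $\nabla \log\det \mathbf{X} = \mathrm{Tr}\left(\mathbf{X}^{-1}\nabla\mathbf{X}\right)$, where $\mathbf{X}$ is a matrix. First, $\nabla_{\mathbf{B}_k} R_k$ is computed. We have $\nabla_{[\mathbf{B}_k]_{nm}} \mathbf{E}_k^{-1} = \mathbf{e}_m\mathbf{e}_n^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k$, where $\mathbf{e}_n$ is a unit vector with one at the $n$th element and zeros elsewhere. So:

$$\nabla_{[\mathbf{B}_k]_{nm}} R_k = \mathrm{Tr}\left(\mathbf{E}_k\mathbf{e}_m\mathbf{e}_n^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\right) = \mathbf{e}_n^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k\mathbf{e}_m. \tag{13}$$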
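The rate identity (9) and the gradient implied by (13), namely $\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k$, can be sanity-checked numerically. The sketch below assumes the natural logarithm, random channels and filters, and a hypothetical helper `rate_and_mmse`. It relies on the standard fact that, for a real-valued function with derivative $\mathbf{G}$ with respect to $\mathbf{B}_k^*$, a small step $\epsilon\mathbf{D}$ changes the function by approximately $2\epsilon\,\mathrm{Re}\,\mathrm{Tr}(\mathbf{G}^H\mathbf{D})$.

```python
import numpy as np

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def rate_and_mmse(H_k, B, k):
    """Return R_k from (5), E_k from (8), and the inverse of the covariance (6)."""
    Q = H_k.shape[0]
    R_noise = np.eye(Q, dtype=complex)
    for i, B_i in enumerate(B):
        if i != k:                                   # interference from other users
            R_noise += H_k @ B_i @ B_i.conj().T @ H_k.conj().T
    R_inv = np.linalg.inv(R_noise)
    S = B[k].conj().T @ H_k.conj().T @ R_inv @ H_k @ B[k]
    E_k = np.linalg.inv(np.eye(S.shape[0]) + S)
    R_k = np.linalg.slogdet(np.eye(S.shape[0]) + S)[1]   # natural-log determinant
    return R_k, E_k, R_inv

rng = np.random.default_rng(1)
P, K, Q, k = 4, 2, 2, 0                              # assumed example sizes
H = [crandn(rng, Q, P) for _ in range(K)]
B = [crandn(rng, P, Q) for _ in range(K)]

R_k, E_k, R_inv = rate_and_mmse(H[k], B, k)
rate_from_mmse = -np.linalg.slogdet(E_k)[1]          # log det(E_k^{-1}), eq. (9)

# Closed-form gradient of R_k w.r.t. conj(B_k), the matrix form of (13).
G = H[k].conj().T @ R_inv @ H[k] @ B[k] @ E_k

# Central finite difference along a random direction D.
D = crandn(rng, P, Q)
eps = 1e-6
B_plus = list(B)
B_plus[k] = B[k] + eps * D
B_minus = list(B)
B_minus[k] = B[k] - eps * D
numeric = (rate_and_mmse(H[k], B_plus, k)[0]
           - rate_and_mmse(H[k], B_minus, k)[0]) / (2 * eps)
analytic = 2 * np.real(np.trace(G.conj().T @ D))
```

Agreement between `numeric` and `analytic` only confirms the gradient along the sampled direction; repeating with several directions `D` gives a broader check.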

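As $\nabla_{[\mathbf{B}_k]_{nm}} R_k = [\nabla_{\mathbf{B}_k} R_k]_{nm}$ we conclude:

$$\nabla_{\mathbf{B}_k} R_k = \mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k. \tag{14}$$

Next, $\nabla_{\mathbf{B}_k} R_i$, $i \neq k$, is computed. We have $\nabla_{[\mathbf{B}_k]_{nm}} \mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1} = -\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\left(\nabla_{[\mathbf{B}_k]_{nm}} \mathbf{R}_{\tilde{v}_i\tilde{v}_i}\right)\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}$ and $\nabla_{[\mathbf{B}_k]_{nm}} \mathbf{R}_{\tilde{v}_i\tilde{v}_i} = \mathbf{H}_i\mathbf{B}_k\mathbf{e}_m\mathbf{e}_n^H\mathbf{H}_i^H$. So:

$$\begin{aligned}
\nabla_{[\mathbf{B}_k]_{nm}} R_i &= \mathrm{Tr}\left(\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\left(\nabla_{[\mathbf{B}_k]_{nm}} \mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\right)\mathbf{H}_i\mathbf{B}_i\right) \\
&= \mathrm{Tr}\left(\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\left(-\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_k\mathbf{e}_m\mathbf{e}_n^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\right)\mathbf{H}_i\mathbf{B}_i\right) \\
&= -\mathbf{e}_n^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_k\mathbf{e}_m.
\end{aligned} \tag{15}$$

So, we obtain:

$$\nabla_{\mathbf{B}_k} R_i = -\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_k. \tag{16}$$

The results (14) and (16) are finally used to obtain the WSR-gradient expression of equation (10):

$$\nabla_{\mathbf{B}_k} f = -u_k\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k + \left(\sum_{i=1, i\neq k}^{K} u_i\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\right)\mathbf{B}_k + \lambda\mathbf{B}_k. \tag{10}$$

B. Gradient of Weighted Minimum Mean Square Error Minimization

The new optimization problem we investigate is the WMMSE transmit filter design problem assuming that MMSE receive filtering is applied:

$$[\mathbf{B}_1^{WMMSE}, \dots, \mathbf{B}_K^{WMMSE}] = \arg\min_{\mathbf{B}_1, \dots, \mathbf{B}_K} \sum_k \mathrm{Tr}\left(\mathbf{W}_k\mathbf{E}_k\right) \quad \text{s.t.} \quad \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) = E_{tx}. \tag{17}$$

The matrix $\mathbf{W}_k \in \mathbb{C}^{[Q_k \times Q_k]}$ is a constant weight matrix associated with user $k$. The Lagrangian reads:

$$g(\mathbf{B}_1, \dots, \mathbf{B}_K) = \sum_k \mathrm{Tr}\left(\mathbf{W}_k\mathbf{E}_k\right) + \lambda\left(\sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) - E_{tx}\right). \tag{18}$$

The gradient $\nabla_{\mathbf{B}_k} g$ is computed in a similar manner as $\nabla_{\mathbf{B}_k} f$ by considering the different parts of the summation. Firstly,

$$\nabla_{\mathbf{B}_k} \mathrm{Tr}\left(\mathbf{W}_k\mathbf{E}_k\right) = -\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k\mathbf{W}_k\mathbf{E}_k. \tag{19}$$

Secondly, for $i \neq k$,

$$\nabla_{\mathbf{B}_k} \mathrm{Tr}\left(\mathbf{W}_i\mathbf{E}_i\right) = \mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{W}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_k. \tag{20}$$

Combining (19) and (20), we obtain the WMMSE-gradient expression of equation (11):

$$\nabla_{\mathbf{B}_k} g = -\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k\mathbf{W}_k\mathbf{E}_k + \left(\sum_{i=1, i\neq k}^{K} \mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{W}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\right)\mathbf{B}_k + \lambda\mathbf{B}_k. \tag{11}$$

Notice also that $\nabla_{\lambda} f = \nabla_{\lambda} g = \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) - E_{tx}$.

C. Comparison of WSR and WMMSE gradients

Comparing (10) and (11) it is clear that the two problems are closely related. In fact, for a given set of transmit filters $\mathbf{B}_1, \dots, \mathbf{B}_K$ and corresponding MMSE-matrices $\mathbf{E}_1, \dots, \mathbf{E}_K$, the WMMSE-gradient can be made identical to the WSR-gradient if the following MSE-weights are selected for all $k$:

$$\mathbf{W}_k = u_k\mathbf{E}_k^{-1}. \tag{21}$$

Consider next that we have a WSR optimal point, i.e. where $\nabla_{\mathbf{B}_k} f = 0 \; \forall k$ and $\nabla_{\lambda} f = 0$, and that the sets of transmit filters and corresponding MMSE-matrices at this point are, respectively, $\mathbf{B}_1^{WSR}, \dots, \mathbf{B}_K^{WSR}$ and $\mathbf{E}_1^{WSR}, \dots, \mathbf{E}_K^{WSR}$. If this set of MMSE-matrices is used to compute a set of MSE-weights according to (21), then the KKT conditions for the WMMSE problem are satisfied for the same set of transmit filters, i.e. the point is also a WMMSE optimal point with $\mathbf{B}_k^{WMMSE} = \mathbf{B}_k^{WSR} \; \forall k$. The fact that the KKT conditions of the two problems can be satisfied simultaneously suggests that it is possible to solve the WSR problem through the use of WMMSE and a proper set of MSE-weights.

IV. ALTERNATING OPTIMIZATION FOR FINDING A LOCAL WSR-OPTIMUM

The basic idea is to alternate between WMMSE optimization of $\mathbf{B}_1, \dots, \mathbf{B}_K$ and the MSE-weight update for $\mathbf{W}_1, \dots, \mathbf{W}_K$ based on (21). If this iterative process converges, it converges to a fixed point, which is also a stationary point of the WSR objective function.

There are various ways to implement such an alternating optimization process and, in particular, the weight matrices can be updated at different stages. As described above, one possibility is to update the MSE-weights after each update of the transmit filters. This method has been tested and gives good performance. The method we tested requires an inner loop to perform WMMSE optimization over $\mathbf{B}_1, \dots, \mathbf{B}_K$ for fixed MSE-weights. Specifically, the inner loop performs alternating optimization between MMSE receive filters and WMMSE transmit filters according to the method described in [17]. Overall, the disadvantage of this method is the number of required iterations, since inner iterations have to be performed for each weight update. Fortunately, it turns out that the inner iterations are not needed to obtain an algorithm which converges to a local WSR-optimum. Therefore we propose an algorithm which contains a single loop in which the MSE-weights and the transmit/receive filters are updated in turn. In summary, the proposed algorithm, which we coin Weighted Sum-Rate maximization Beamforming using Weighted sum-Minimum Mean Square Error (WSRBF-WMMSE), is: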

PROPOSED ALGORITHM: WSRBF-WMMSE
  set $n = 0$
  set $\mathbf{B}_k^n = \mathbf{B}_k^{init} \; \forall k$
  iterate
    update $n = n + 1$
    I. compute $\mathbf{A}_k^n \,|\, \mathbf{B}_i^{n-1} \; \forall i$ for all $k$ using (7)
    II. compute $\mathbf{W}_k^n \,|\, \mathbf{B}_i^{n-1} \; \forall i$ for all $k$ using (21), (8)
    III. compute $\mathbf{B}^n \,|\, \mathbf{A}^n, \mathbf{W}^n$ using (22), (23)
  until convergence

The first step updates the MMSE receive filters given the transmit filters from the previous iteration. The second step updates the MSE-weights given the transmit filters, as a function of the MMSE-matrix. The third step computes the WMMSE transmit filters given the receive filters and MSE-weights.

The problem of computing the MMSE transmit filter $\mathbf{B}_{[P \times QK]} = [\mathbf{B}_1, \dots, \mathbf{B}_K]$ for fixed receive filters was treated in [18] for the unweighted MMSE case, but the extension to WMMSE is straightforward. The WMMSE transmit filter structure is computed as:

$$\bar{\mathbf{B}} = \left(\mathbf{H}^H\mathbf{A}^H\mathbf{W}\mathbf{A}\mathbf{H} + \frac{\mathrm{Tr}\left(\mathbf{W}\mathbf{A}\mathbf{A}^H\right)}{E_{tx}}\mathbf{I}_P\right)^{-1}\mathbf{H}^H\mathbf{A}^H\mathbf{W}, \tag{22}$$

where $\mathbf{W}_{[QK \times QK]} = \mathrm{diag}\{\mathbf{W}_1, \dots, \mathbf{W}_K\}$ and $\mathbf{A}_{[QK \times QK]} = \mathrm{diag}\{\mathbf{A}_1, \dots, \mathbf{A}_K\}$ are block-diagonal matrices, and $\mathbf{H}_{[QK \times P]} = \left[\mathbf{H}_1^T, \dots, \mathbf{H}_K^T\right]^T$ contains the different channel matrices stacked row-wise. The transmit filter is then computed as

$$\mathbf{B}^{WMMSE} = b\bar{\mathbf{B}}, \tag{23}$$

where $b = \sqrt{\dfrac{E_{tx}}{\mathrm{Tr}\left(\bar{\mathbf{B}}\bar{\mathbf{B}}^H\right)}}$ is a gain factor which scales the signal so as to satisfy the transmit power constraint. Following [18], $\mathbf{B}^{WMMSE}$ is derived under the assumption that all receive filters are rescaled by $\frac{1}{b}$.

A. Convergence analysis

Following the reasoning of Section III-C it is clear that if the proposed algorithm is initialized by $\mathbf{B}_k^{WSR} \; \forall k$, then after one iteration it will return the optimal filter again, i.e. $\mathbf{B}_k^n = \mathbf{B}_k^{WSR} \; \forall k$. Convergence in the general case is proven by proving monotonic convergence of an equivalent optimization problem, which is based on expanding the cost $-u_k\log\det\left(\mathbf{E}_k^{-1}\right)$ to include the MSE-weights and receive filters as optimization variables in addition to the transmit filters.

We define $\tilde{\mathbf{E}}_k = E\left[\left(\mathbf{A}_k\mathbf{y}_k - \mathbf{d}_k\right)\left(\mathbf{A}_k\mathbf{y}_k - \mathbf{d}_k\right)^H\right]$, the MSE-matrix. Note that it is different from $\mathbf{E}_k$ defined in (8). Consider the following cost function:

$$\tilde{l}_k(\mathbf{W}_k, \mathbf{A}_k, \mathbf{B}_i \, \forall i) = \mathrm{Tr}\left(\mathbf{W}_k\tilde{\mathbf{E}}_k\right) - u_k\log\det\left(u_k^{-1}\mathbf{W}_k\right) - u_k Q, \tag{24}$$

and the following objective:

$$[\mathbf{B}_k^{WSR} \, \forall k] = \arg\min_{\mathbf{B}_k \forall k,\, \mathbf{A}_k \forall k,\, \mathbf{W}_k \forall k} \sum_k \tilde{l}_k(\mathbf{W}_k, \mathbf{A}_k, \mathbf{B}_i \, \forall i) \quad \text{s.t.} \quad \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) \le E_{tx}. \tag{25}$$

We first prove that the optimization w.r.t. the transmit filters $\mathbf{B}_k$ using this criterion is the same as the original WSR optimization (4). First we minimize $\tilde{l}_k(\mathbf{W}_k, \mathbf{A}_k, \mathbf{B}_i \, \forall i)$ w.r.t. $\mathbf{A}_k$, considering weights and transmit filters fixed. The minimizing value is unique and is denoted as $\mathbf{A}_k^{MMSE}(\mathbf{B}_i \, \forall i)$ (see eq. (7)). The variable $\mathbf{A}_k$ intervenes only in $\tilde{\mathbf{E}}_k$, and substituting $\mathbf{A}_k$ by $\mathbf{A}_k^{MMSE}(\mathbf{B}_i \, \forall i)$, $\tilde{\mathbf{E}}_k$ becomes equal to $\mathbf{E}_k$. Hence, we get a new cost function for the transmit filters and the weights:

$$l_k(\mathbf{W}_k, \mathbf{B}_i \, \forall i) = \mathrm{Tr}\left(\mathbf{W}_k\mathbf{E}_k\right) - u_k\log\det\left(u_k^{-1}\mathbf{W}_k\right) - u_k Q. \tag{26}$$

Minimizing $l_k(\mathbf{W}_k, \mathbf{B}_i \, \forall i)$ w.r.t. $\mathbf{W}_k$ leads to $\mathbf{W}_k^{min}(\mathbf{B}_i \, \forall i) = u_k\mathbf{E}_k^{-1}(\mathbf{B}_i \, \forall i)$. Substituting $\mathbf{W}_k$ in $l_k(\mathbf{W}_k, \mathbf{B}_i \, \forall i)$ by $\mathbf{W}_k^{min}$ we get the cost function $-u_k\log\det\left(\mathbf{E}_k^{-1}\right)$, which corresponds to the original WSR-cost.

Now, we prove that alternating minimization of the cost $\sum_k \tilde{l}_k(\cdot)$ in (25) corresponds to the steps I, II, III of WSRBF-WMMSE. When $\mathbf{W}_k \; \forall k$ is constant, the remaining terms of (24) do not depend on the filters and the cost function is $\sum_k \mathrm{Tr}\left(\mathbf{W}_k\tilde{\mathbf{E}}_k(\mathbf{A}_k, \mathbf{B}_i \, \forall i)\right)$, so in the alternating minimization process, finding $\mathbf{A}_k$ given $\mathbf{B}_i^{n-1} \; \forall i$ gives the same result as in step I, and finding $\mathbf{B}_k \; \forall k$ given $\mathbf{A}_k^n \; \forall k$ and $\mathbf{W}_k^n \; \forall k$ gives the same result as step III. Optimization w.r.t. $\mathbf{W}_k$ with $\mathbf{A}_k$ and $\mathbf{B}_i \; \forall i$ fixed gives $\mathbf{W}_k = u_k\tilde{\mathbf{E}}_k^{-1}(\mathbf{A}_k^n, \mathbf{B}_i^{n-1} \, \forall i) = u_k\mathbf{E}_k^{-1}(\mathbf{B}_i^{n-1} \, \forall i)$. So we have the same result as step II.

Due to the alternating minimization process, the cost $\sum_k \tilde{l}_k(\cdot)$ decreases monotonically. Since the WSR-value given a power constraint is upper bounded, the cost (25) is lower bounded (by the negative WSR-maximum). We conclude that we have convergence to a local minimum. Note that the original WSR-cost does not necessarily experience a monotonic convergence, although simulation results show that this is often the case.

V. ALGORITHM WITH DIAGONAL WEIGHTING MATRIX LEADING TO DIAGONAL MMSE-MATRICES

The algorithm proposed in the previous section leads to non-diagonal MMSE-matrices in general. This means that the receiver for each user with $Q$ antennas needs to separate $Q$ correlated streams, i.e. for user $k$, joint decoding of $Q$ streams is needed in order to realize the rate $\log\det\left(\mathbf{E}_k^{-1}\right)$. Consider a case where the MMSE-matrix is diagonal, i.e. the streams are decorrelated. In this case it suffices to decode the streams separately, since for diagonal $\mathbf{E}_k$, $\log\det\left(\mathbf{E}_k^{-1}\right) = \sum_q \log\left(e_{k,q}^{-1}\right)$, where $e_{k,q} = [\mathbf{E}_k]_{[q,q]}$ denotes the MMSE of the $q$th stream belonging to user $k$. Clearly, from a receiver complexity point of view it would be preferable if $\mathbf{E}_k$ were always diagonal, such that joint decoding is not needed.

This section first shows that for any set of transmit filters $\mathbf{B}_1, \dots, \mathbf{B}_K$ which leads to a certain rate-tuple, it is possible to generate a modified set $\tilde{\mathbf{B}}_1, \dots, \tilde{\mathbf{B}}_K$ with the same rate-tuple, but achieved with diagonal MMSE-matrices at all users. This result can be applied to the WSR maximizing point, meaning that an equivalent optimal solution exists that leads to diagonal MMSE-matrices.
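Based on this result, we propose an algorithm where the weighting matrices updated at each iteration are constrained to be diagonal (by setting the off-diagonal terms to zero). At convergence, this algorithm leads to a local optimum corresponding to diagonal MMSE-matrices for each user. The main advantage of this algorithm compared to the one previously described is a decreased decoding complexity for each user.

A. Existence of equivalent rate-tuples with diagonal MMSE-matrices

First, define a modified transmit filter as $\tilde{\mathbf{B}}_k = \mathbf{B}_k\mathbf{Q}_k$, where $\mathbf{Q}_k$ is some unitary matrix, i.e. $\mathbf{Q}_k^H\mathbf{Q}_k = \mathbf{Q}_k\mathbf{Q}_k^H = \mathbf{I}$. Using the modified filter will not change the rate-tuple. First, the rate for user $k$ will not change:

$$R_k = \log\det\left(\mathbf{I}_k + \mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\right) \tag{27}$$

$$= \log\det\left(\mathbf{I}_k + \mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{Q}_k\mathbf{Q}_k^H\right) \tag{28}$$

$$= \log\det\left(\mathbf{I}_k + \mathbf{Q}_k^H\mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{Q}_k\right) \tag{29}$$

$$= \log\det\left(\mathbf{I}_k + \tilde{\mathbf{B}}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\tilde{\mathbf{B}}_k\right). \tag{30}$$

(28) comes from the equality $\mathbf{Q}_k\mathbf{Q}_k^H = \mathbf{I}$. (29) comes from the identity $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$. Secondly, the rates of all other users will not change since their effective noise covariances will not change. Specifically, the effective noise covariances of other users depend on the outer product $\tilde{\mathbf{B}}_k\tilde{\mathbf{B}}_k^H = \mathbf{B}_k\mathbf{Q}_k\mathbf{Q}_k^H\mathbf{B}_k^H = \mathbf{B}_k\mathbf{B}_k^H$.

Suppose we pick $\mathbf{Q}_k = \mathbf{V}_k$, where $\mathbf{V}_k$ is given by the eigenvalue decomposition $\mathbf{V}_k\mathbf{\Lambda}_k\mathbf{V}_k^H = \mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k$. The MSE-matrix associated to $\tilde{\mathbf{B}}_k$ given by (8) is:

$$\mathbf{E}_k^{-1} = \mathbf{I}_k + \tilde{\mathbf{B}}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\tilde{\mathbf{B}}_k \tag{31}$$

$$= \mathbf{I}_k + \mathbf{V}_k^H\mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{V}_k \tag{32}$$

$$= \mathbf{I}_k + \mathbf{\Lambda}_k. \tag{33}$$

Therefore, $\mathbf{E}_k = \left(\mathbf{I}_k + \mathbf{\Lambda}_k\right)^{-1}$ is diagonal. In summary, for any set of transmit filters $\mathbf{B}_1, \dots, \mathbf{B}_K$ it is possible to generate a diagonalizing set which has the same rate-tuple. Therefore this can also be done at the WSR-optimal point.

B. Equivalent optimization criterion

The previous section proved the existence of a WSR-optimal point having diagonal MMSE-matrices. In general $\log\det\left(\mathbf{E}_k^{-1}\right) \ge \sum_q \log\left(e_{k,q}^{-1}\right)$, but equality is achieved when $\mathbf{E}_k$ is diagonal. Following the diagonalization arguments from the previous section:

$$\log\det \mathbf{E}_k^{-1}\left(\mathbf{B}_k, \mathbf{R}_{\tilde{v}_k\tilde{v}_k}\right) = \max_{\mathbf{Q}_k} \sum_q \log e_{k,q}^{-1}\left(\mathbf{B}_k\mathbf{Q}_k, \mathbf{R}_{\tilde{v}_k\tilde{v}_k}\right), \tag{34}$$

where $\mathbf{Q}_k$ is constrained to be a unitary matrix. The maximizing value is $\mathbf{Q}_k = \mathbf{V}_k$. A WSR-point with diagonal MMSE-matrices can therefore be found as the solution to the minimization problem:

$$[\mathbf{B}_k^{WSRDIAG} \, \forall k] = \arg\min_{\mathbf{B}_k, \mathbf{Q}_k \forall k} \sum_k \sum_q -u_k\log e_{k,q}^{-1} \quad \text{s.t.} \quad \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) \le E_{tx}, \quad \tilde{\mathbf{B}}_k = \mathbf{B}_k\mathbf{Q}_k \; \forall k. \tag{35}$$

Clearly the matrix $\mathbf{B}_k\mathbf{Q}_k$ can be considered as a single variable under optimization and the problem can then be formulated as:

$$[\mathbf{B}_k^{WSRDIAG} \, \forall k] = \arg\min_{\mathbf{B}_k \forall k} \sum_k \sum_q -u_k\log e_{k,q}^{-1} \quad \text{s.t.} \quad \sum_k \mathrm{Tr}\left(\mathbf{B}_k\mathbf{B}_k^H\right) \le E_{tx}. \tag{36}$$

If we now consider deriving gradients for the beamforming vectors for each stream, i.e. $\mathbf{b}_{k,q}$ corresponding to the $q$th column of $\mathbf{B}_k$, we can show a similar relationship between WSR and WMMSE objectives as in Section III. In this case the weight-expression (21) would become:

$$w_{k,q} = u_k e_{k,q}^{-1}. \tag{37}$$

This suggests an algorithm where $\mathbf{W}_k$ is updated as the diagonal matrix

$$\mathbf{W}_k = u_k\,\mathrm{diag}\{e_{k,1}^{-1}, \dots, e_{k,Q}^{-1}\}. \tag{38}$$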
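The diagonalization argument of Section V-A and the diagonal weight update (38) can be illustrated numerically. The sketch below uses assumed values throughout: a random channel and filter, a randomly generated positive definite matrix standing in for the effective noise covariance (6), the natural logarithm, and a hypothetical helper `mmse_matrix`.

```python
import numpy as np

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def mmse_matrix(H_k, B_k, R_noise):
    """E_k = (I + B_k^H H_k^H R^{-1} H_k B_k)^{-1} as in (8); also returns the inner term."""
    S = B_k.conj().T @ H_k.conj().T @ np.linalg.solve(R_noise, H_k @ B_k)
    return np.linalg.inv(np.eye(S.shape[0]) + S), S

rng = np.random.default_rng(2)
P, Q, u_k = 4, 2, 1.0                               # assumed example values
H_k = crandn(rng, Q, P)
B_k = crandn(rng, P, Q)
G = crandn(rng, Q, Q)
R_noise = np.eye(Q) + G @ G.conj().T                # stand-in for the covariance (6)

E_k, S = mmse_matrix(H_k, B_k, R_noise)
_, V_k = np.linalg.eigh(S)                          # S = V_k Lambda_k V_k^H, V_k unitary
E_rot, _ = mmse_matrix(H_k, B_k @ V_k, R_noise)     # MMSE-matrix of B_k V_k

rate = -np.linalg.slogdet(E_k)[1]                   # log det(E_k^{-1})
rate_rot = -np.linalg.slogdet(E_rot)[1]             # unchanged by the rotation
per_stream_before = np.sum(np.log(1.0 / np.real(np.diag(E_k))))
per_stream_after = np.sum(np.log(1.0 / np.real(np.diag(E_rot))))
off_diag = np.abs(E_rot - np.diag(np.diag(E_rot))).max()

# Diagonal weight update (38) evaluated at the rotated filter.
W_k = u_k * np.diag(1.0 / np.real(np.diag(E_rot)))
```

Here `rate_rot` matches `rate`, `off_diag` is at rounding level, and the per-stream sum reaches the joint rate only after the rotation, which is the content of (34).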

C. Algorithm with diagonal weighting matrix

Compared to the previous algorithm described in Section IV, the MMSE transmit/receive filter computation steps do not change, and thus the only change compared to WSRBF-WMMSE is step II:

PROPOSED ALGORITHM: WSRBF-WMMSE-D
  set $n = 0$
  set $\mathbf{B}_k^n = \mathbf{B}_k^{init} \; \forall k$
  iterate
    update $n = n + 1$
    I. compute $\mathbf{A}_k^n \,|\, \mathbf{B}_i^{n-1} \; \forall i$ for all $k$ using (7)
    II. compute $\mathbf{W}_k^n \,|\, \mathbf{B}_i^{n-1} \; \forall i$ for all $k$ using (38), (8)
    III. compute $\mathbf{B}^n \,|\, \mathbf{A}^n, \mathbf{W}^n$ using (22), (23)
  until convergence

Notice that when $Q = 1$, the two algorithms are identical. Convergence is proven in a similar manner as in Section IV-A by using, instead of the cost $\sum_k \tilde{l}_k(\cdot)$, a cost $\sum_k \sum_q \tilde{m}_{k,q}(\cdot)$, where

$$\tilde{m}_{k,q}(w_{k,q}, \mathbf{a}_{k,q}^H, \mathbf{B}_i \, \forall i) = w_{k,q}\tilde{e}_{k,q} - u_k\log\left(u_k^{-1}w_{k,q}\right) - u_k. \tag{39}$$

Here $\mathbf{a}_{k,q}^H$ denotes the $q$th row of $\mathbf{A}_k$ and $\tilde{e}_{k,q} = [\tilde{\mathbf{E}}_k]_{[q,q]}$. Furthermore, as in Section IV-A, a fixed point is also a stationary point of (36). Thus, if the algorithm is appropriately initialized, the fixed point will also be a (global) solution of (36) with diagonal WMMSE matrices and thus also of (4).
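The two algorithm listings (WSRBF-WMMSE from Section IV and WSRBF-WMMSE-D above) can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' implementation. The function names are hypothetical, the natural logarithm is used, a fixed iteration count replaces the convergence test, the initialization is the scaled matched filter $\mathbf{H}_k^H$, and step III evaluates (22) user by user instead of forming the block-diagonal $\mathbf{W}$ and $\mathbf{A}$ (the two forms give the same sums).

```python
import numpy as np

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def herm(X):
    """Conjugate transpose of the last two axes."""
    return np.conj(np.swapaxes(X, -1, -2))

def mmse_receivers(H, B):
    """Step I: receive filters A_k from (7) and the resulting MMSE-matrices E_k."""
    K, Q, P = H.shape
    T = sum(B[i] @ herm(B[i]) for i in range(K))        # sum_i B_i B_i^H
    A, E = [], []
    for k in range(K):
        C = np.eye(Q) + H[k] @ T @ herm(H[k])           # covariance of y_k
        A_k = herm(B[k]) @ herm(H[k]) @ np.linalg.inv(C)
        A.append(A_k)
        E.append(np.eye(Q) - A_k @ H[k] @ B[k])         # equals (8) by the matrix inversion lemma
    return np.array(A), np.array(E)

def weighted_sum_rate(H, B, u):
    """sum_k u_k log det(E_k^{-1}) in nats."""
    _, E = mmse_receivers(H, B)
    return sum(-u[k] * np.linalg.slogdet(E[k])[1] for k in range(len(u)))

def wsrbf_wmmse(H, u, E_tx, n_iter=50, diagonal=False):
    K, Q, P = H.shape
    B = herm(H)                                         # matched-filter initialization
    B = B * np.sqrt(E_tx / np.sum(np.abs(B) ** 2))
    history = [weighted_sum_rate(H, B, u)]
    for _ in range(n_iter):
        A, E = mmse_receivers(H, B)                     # step I
        if diagonal:                                    # step II with (38)
            W = [u[k] * np.diag(1.0 / np.real(np.diag(E[k]))) for k in range(K)]
        else:                                           # step II with (21)
            W = [u[k] * np.linalg.inv(E[k]) for k in range(K)]
        # Step III: (22) followed by the scaling (23).
        M = sum(herm(H[k]) @ herm(A[k]) @ W[k] @ A[k] @ H[k] for k in range(K))
        mu = sum(np.real(np.trace(W[k] @ A[k] @ herm(A[k]))) for k in range(K)) / E_tx
        M_inv = np.linalg.inv(M + mu * np.eye(P))
        B_bar = np.array([M_inv @ herm(H[k]) @ herm(A[k]) @ W[k] for k in range(K)])
        B = B_bar * np.sqrt(E_tx / np.sum(np.abs(B_bar) ** 2))
        history.append(weighted_sum_rate(H, B, u))
    return B, history

rng = np.random.default_rng(3)
K, Q, P, E_tx = 2, 2, 4, 10.0                           # assumed example sizes
H = crandn(rng, K, Q, P)
u = np.ones(K)                                          # equal weights: plain sum-rate
B_full, hist_full = wsrbf_wmmse(H, u, E_tx)
B_diag, hist_diag = wsrbf_wmmse(H, u, E_tx, diagonal=True)
```

The `history` list records the weighted sum-rate after every iteration, which makes it easy to inspect the convergence behaviour for a given channel draw.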

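D. Diagonalizing structure of the WMMSE solution for diagonal MSE weight matrices

The algorithm WSRBF-WMMSE-D converges to a WSR-optimum as a WMMSE-optimum with an optimized set of diagonal MSE-weight matrices $\mathbf{W}_k \; \forall k$. As argued, this algorithm converges to a solution where the MMSE-matrices $\mathbf{E}_k \; \forall k$ are diagonal. For completeness, this section shows that for any set of diagonal MSE-weight matrices, the WMMSE-optimum will have diagonal MMSE-matrices, and thus not only for the diagonal weight matrices achieved at a fixed point of our algorithm.

We use the WMMSE gradient expression (11) and consider the stationary point of the WMMSE solution satisfying $\nabla_{\mathbf{B}_k} g = 0 \; \forall k$. First, define $\mathbf{M} = \sum_{i=1, i\neq k}^{K} \mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i\mathbf{B}_i\mathbf{E}_i\mathbf{W}_i\mathbf{E}_i\mathbf{B}_i^H\mathbf{H}_i^H\mathbf{R}_{\tilde{v}_i\tilde{v}_i}^{-1}\mathbf{H}_i$, and then multiply (11) by $\mathbf{B}_k^H$ from the left:

$$\begin{aligned}
-\mathbf{B}_k^H\mathbf{H}_k^H\mathbf{R}_{\tilde{v}_k\tilde{v}_k}^{-1}\mathbf{H}_k\mathbf{B}_k\mathbf{E}_k\mathbf{W}_k\mathbf{E}_k + \mathbf{B}_k^H\mathbf{M}\mathbf{B}_k + \lambda\mathbf{B}_k^H\mathbf{B}_k &= 0 \\
\Leftrightarrow \; -\left(\mathbf{E}_k^{-1} - \mathbf{I}_k\right)\mathbf{E}_k\mathbf{W}_k\mathbf{E}_k + \mathbf{B}_k^H\mathbf{M}\mathbf{B}_k + \lambda\mathbf{B}_k^H\mathbf{B}_k &= 0.
\end{aligned}$$

This means that

$$\mathbf{W}_k\mathbf{E}_k = \mathbf{E}_k\mathbf{W}_k\mathbf{E}_k + \mathbf{B}_k^H\mathbf{M}\mathbf{B}_k + \lambda\mathbf{B}_k^H\mathbf{B}_k. \tag{40}$$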

Considering (40) it is clear that the term $\mathbf{W}_k\mathbf{E}_k$ must be Hermitian since it equals a quantity (right hand side) that is Hermitian. For the case where $\mathbf{W}_k$ contains real (distinct) diagonal elements, we conclude that the term $\mathbf{E}_k$ is also real diagonal: since $\mathbf{W}_k$ and $\mathbf{E}_k$ are both Hermitian, $\mathbf{W}_k\mathbf{E}_k$ being Hermitian means that $\mathbf{W}_k\mathbf{E}_k = \mathbf{E}_k\mathbf{W}_k$, and a matrix that commutes with a diagonal matrix with distinct diagonal entries must itself be diagonal. This is a similar argument as used for the filter structure in the Single User Multiple Input Multiple Output (SU-MIMO) case [14, Appendix B]. The result also holds for the case where $\mathbf{W}_k$ has repeated diagonal elements. As done in [14], this can be shown by exploiting the fact that $\mathbf{E}_k$ is a continuous function of $\mathbf{W}_k$, combined with the usage of a perturbation matrix $\Delta\mathbf{W}_k$ that ensures that the diagonal entries of $\mathbf{W}_k + \Delta\mathbf{W}_k$ are distinct. Therefore in the limit $\lim_{\Delta\mathbf{W}_k \to 0} \mathbf{E}_k(\mathbf{W}_k + \Delta\mathbf{W}_k) = \mathbf{E}_k(\mathbf{W}_k)$, and hence $\mathbf{E}_k(\mathbf{W}_k)$ can be diagonal also where $\mathbf{W}_k$ has repeated diagonal elements. Notice though that for unweighted MMSE optimization $\mathbf{E}_k$ need not be diagonal, as discussed in [17].

VI. NUMERICAL EXAMPLES

This section evaluates sum-rate performance for the MIMO downlink for different system settings using Monte-Carlo simulation. In all simulations the number of transmit antennas is fixed to $P = 4$. The elements of the channel matrices are generated as i.i.d. Gaussian random variables $\mathcal{CN}(0, \sigma_h^2)$ and the receive noise covariances are normalized, i.e. $\mathbf{R}_{v_k v_k} = \mathbf{I}_Q$. Since both noise and data covariance are normalized, we define SNR as $\sigma_h^2$.

A. Convergence Properties

First we study the convergence properties by comparing WSRBF-WMMSE to the recently proposed ALGORITHM 1 of [11], which also converges to a local WSR-optimum. For both algorithms we choose a simple initialization by the transmit matched filter, i.e. $\mathbf{B}_k^{init} = b\mathbf{H}_k^H \; \forall k$, where $b$ is selected so as to satisfy the transmit power constraint. Sum-rate convergence results for four different scenarios are shown in Figure 1. Overall the plots indicate that convergence speed is comparable for the two methods, although the convergence speed varies for the individual channel realizations.
For some channel realizations WSRBF-WMMSE appears to converge slightly faster than A LGORITHM 1 of [11] and vice versa for other channel realizations. From a complexity point of view the proposed algorithm has the advantage that it does not require solving a GP in each iteration as A LGORITHM 1

Sum-rate [bits/comp. dim.]

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 12, DECEMBER 2008

Sum-rate [bits/comp. dim.]

6

35

K = 4, Q = 2

30 25 20

K = 2, Q = 2

15 10

WSRBF-WMMSE WSRBF-WMMSE-D A LGORITHM 1 [11]

5 0

0

5

10

15

Iteration number n

20

35

25

30

K = 8, Q = 1

30 25 20

K = 4, Q = 1

15 10

WSRBF-WMMSE WSRBF-WMMSE-D A LGORITHM 1 [11]

5 0

0

5

10

15

Iteration number n

20

25

30

Fig. 1. Convergence properties for 1 (randomly selected) channel realization with SNR = 10 dB with P = 4 transmit antennas, and with different combinations of users K and receive antennas Q.

of [11]. Solving a GP3 has a worst-case polynomial time complexity in the number of variables (KQ) [11]. B. Performance In the following three simulations we consider sum-rate performance in different MIMO-BC scenarios. Since the WSRproblem is non-convex, the initialization Binit k ∀k determines if the WSR-optimum obtained after iterations will be local or global. Currently it is unknown how to choose the initialization such that the global optimum is guaranteed. In our simulations we use two versions of WSRBF-WMMSE: 1) Using 10 random filter initializations and selecting the one leading to the highest WSR value, and 2) Using the simple transmit matched filter initialization and allowing only 10 iterations. The first version is chosen to obtain a high performing solution although there is no guarantee that the global optimum is found. The second version is chosen to study the performance of a potentially practical low complexity solution. The presented results are averaged over 1000 channel realizations. DPC sumcapacity reference bounds are produced using A LGORITHM 2 from [19]. In the first two simulations the number of users is varied as K = {4, 20} and each user has a single receive antenna. In the third simulation the number of users is varied as K = {1, 2, 6} and each user has two receive antennas. 1) Fully loaded case: Figure 2 shows the average sum-rate performance for a number of users K = 4. As performance reference we have used a Zero-Forcing Beamforming (ZFBF)based algorithm [9]. This algorithm tries all combinations of scheduled users, and computes the ZF-filter with the optimal power levels (waterfilling) for each combination. The combination with highest sum-rate is selected. The plots show a marginal improvement of WSRBF-WMMSE1 as compared to the ZFBF-based algorithm. WSRBF-WMMSE2 sees a loss 3 We have used Matlab and CVX to solve the GP: ”http://www.stanford.edu/ boyd/cvx/”. 
Generating the curves for Figure 1 took less than a second for WSRBF-WMMSE, whereas it took approximately one hour for A LGORITHM 1 of [11].

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 12, DECEMBER 2008

sum-rate [bits / complex dim.]

35

45

DPC capacity bound WSRBF-WMMSE1 (convergence/10 random init) WSRBF-WMMSE2 (10 iterations - TxMF init) ZFBF optimal user sel. + waterfill [9] Unweighted-MMSE [17] Transmit Matched Filter (TxMF)

40

sum-rate [bits / complex dim.]

40

30

25

20

15

10

5

0 −10

sum-rate [bits / complex dim.]

40

−5

0

5

10

SNR [dB]

15

20

25

30

DPC capacity bound WSRBF-WMMSE2 (10 iterations - TxMF init) ZFBF with Semi-orthogonal User Selection [9] Unweighted-MMSE [17] Transmit Matched Filter (TxMF)

35

30

25

20

15

10

5

0 −10

DPC capacity bound WSRBF-WMMSE1 (convergence/10 random init) WSRBF-WMMSE2 (10 iterations-TxMF init)

35

30

K=6

25

K=2

20

15

K=1

10

5

Fig. 2. Fully loaded case: sum-rate performance averaged over 1000 random channels with P = 4 transmit antennas and K = 4 single receive antenna users (Q = 1).

45

7

−5

0

5

10

SNR [dB]

15

20

25

30

Fig. 3. Overloaded case: sum-rate performance averaged over 1000 random channels with P = 4 transmit antennas and K = 20 single receive antenna users (Q = 1).

at high SNRs but performs well at SNRs up to 15 dB. In the simulations, it was noticed that at low SNR, WSRBF-WMMSE allocates all transmit power to the user with the best channel. This phenomenon is similar to the selection of the best user in the single-antenna degraded broadcast channel to maximize sum-capacity [20]. As the SNR increases, more users are gradually supported simultaneously. To illustrate the role of the MSE weights we have included the unweighted MMSE beamformer, which is also computed using alternating optimization [17]. As seen in Figure 2 and in Figure 3, the selection of weights at each iteration is crucial.

2) Overloaded Case: Figure 3 shows the sum-rate performance when the system has 20 users. Due to the system size, it is considered impractical to use several random initializations, and we focus only on the algorithm version with fixed initialization. As a reference we have instead used the ZFBF with Semi-orthogonal User Selection (SUS) algorithm

Fig. 4. Sum-rate performance averaged over 1000 random channels with P = 4 transmit antennas and K = {1, 2, 6} users with dual receive antennas (Q = 2).

[9], which uses a simplified procedure for finding a good subset of the users since exhaustive search is considered impractical (see footnote 4). Finding the best subset of the 20 users is a nontrivial problem, but the simple initialization by the transmit matched filter combined with only 10 iterations (WSRBF-WMMSE2) finds a good solution. Compared to the ZFBF-SUS algorithm, WSRBF-WMMSE2 clearly performs better in spite of its low complexity. At high Signal-to-Noise Ratio (SNR), WSRBF-WMMSE2 allocates non-zero rates to four users, corresponding to the rank of the channel. In this way WSRBF-WMMSE2 selects 4 of 20 users, rather than attempting to transmit data to all 20 users (in which case the signals that cannot be resolved are treated as interference). The user selection is done automatically by the algorithm, which nulls out some users through the weight update. In general the algorithm finds a subset of users that combines high channel gains with spatial compatibility (nearly orthogonal channels). In contrast, the unweighted MMSE solution (without initial user selection) transmits data to more than 4 users simultaneously, which results in interference limitation at high SNRs.

3) Performance w.r.t. number of users: Figure 4 shows the sum-rate performance with a varying number of users where each user has Q = 2 receive antennas. For K = 1, i.e. the single-user case, WSRBF-WMMSE1 achieves capacity (achieved by waterfilling over channel singular values). The single-user problem is convex and therefore the transmit filter initialization does not matter. For K = {2, 6}, WSRBF-WMMSE1 obtains a slope comparable to the DPC capacity bounds with a loss on the order of 1-2 dB at high SNR. WSRBF-WMMSE2 performs equally well up to ≈10 dB, but is degraded by ≈1 dB at high SNR. Notice that the ZF-based solutions [9] are developed only for single-antenna receivers and are therefore not included in the plot.
Footnote 4: For the ZFBF-SUS algorithm [9], the optimal value of the semi-orthogonality angle α, used in the SUS procedure, was found.
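The single-user reference mentioned for K = 1, capacity by waterfilling over the channel singular values, can be sketched as follows. This is an illustrative sketch assuming unit noise variance and a total transmit power budget; the function names `waterfill` and `single_user_capacity` are hypothetical and do not come from the paper.

```python
import numpy as np

def waterfill(gains, total_power):
    """Power allocation maximizing sum(log2(1 + g_i * p_i)) with sum(p_i) = total_power."""
    gains = np.asarray(gains, dtype=float)        # assumed strictly positive
    g = np.sort(gains)[::-1]                      # strongest mode first
    for m in range(len(g), 0, -1):
        # Water level if the m strongest modes are active.
        mu = (total_power + np.sum(1.0 / g[:m])) / m
        if mu - 1.0 / g[m - 1] > 0:               # weakest active mode gets positive power
            break
    return np.maximum(mu - 1.0 / gains, 0.0)

def single_user_capacity(H, total_power):
    """Capacity in bits per channel use for y = H x + n with unit noise variance (assumed)."""
    gains = np.linalg.svd(H, compute_uv=False) ** 2   # squared singular values
    gains = gains[gains > 1e-12]                      # drop numerically zero modes
    p = waterfill(gains, total_power)
    return float(np.sum(np.log2(1.0 + gains * p)))

# Illustrative use: Q = 2 receive antennas, P = 4 transmit antennas.
rng = np.random.default_rng(0)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
print(single_user_capacity(H, total_power=10.0))
```

Because the single-user problem is convex, this closed-form allocation is the value any initialization of the beamformer should reach for K = 1.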

[Figure 5: four panels (SNR = −10 dB, SNR = 0 dB, SNR = 10 dB, SNR = 20 dB), each plotting R2 [bits/comp. dim.] versus R1 [bits/comp. dim.]. Curves: DPC cap. region; WSRBF-WMMSE1.]

Fig. 5. Achievable rate region for WSRBF-WMMSE1 vs. DPC capacity region for a randomly generated channel with P = 4 transmit antennas, Q = 2 receive antennas and K = 2 users.

4) Achievable rate region: Finally we investigate the achievable rate region of WSRBF-WMMSE1 for a random channel realization in a two-user case. The DPC capacity region is used as reference and is generated using the iterative algorithm presented in [21]. The achievable rate region of WSRBF-WMMSE1 is approximated by first computing rate pairs for a set of different rate weights. The resulting rate pairs correspond to a set of points that lie on the boundary of the achievable rate region. These points can then be connected to approximate the achievable rate region. Specifically, the rate weight for user 1 is fixed as u_R1 = 1, while the rate weight for user 2 is varied as u_R2 = 10^x with x ∈ {−1, −0.95, ..., 0.95, 1}, corresponding to a total of 41 different rate weights. Only the ratio between u_R1 and u_R2 is important, i.e. their absolute level is irrelevant. Figure 5 shows the rate region at different SNRs for the same channel realization. Consider first the low-SNR case with SNR = −10 dB. The proposed beamformer, with the 41 different weight pairs tested, converges to only a few different points. For most weight pairs, the algorithm converges to one of the 2 corner points: only one user receives data, while the other is nulled out. Comparing to the DPC region, it can be seen that the beamforming region achieved by WSRBF-WMMSE1 is nearly optimal. As the SNR increases, the gain of DPC over the proposed beamforming algorithm increases. The general observation, though, is that for the given channel realization, WSRBF-WMMSE1 achieves a fairly large part of the capacity region.

VII. CONCLUSION

This paper studied beamforming design for the MIMO-BC to maximize weighted sum-rate. The paper found its motivation in recent results highlighting a relationship between mutual information and MMSE, and established a simple relation between weighted sum-rate and weighted MMSE in the MIMO-BC. As a result, two simple alternating optimization algorithms based on well-known transmit/receive MMSE designs were proposed for finding a local weighted sum-rate optimum. Numerical results studying sum-rate show that the proposed algorithms achieve high performance, even when initiated by the simple transmit matched filter and allowing only a few iterations. The algorithms are therefore potential candidates for practical low-complexity transmit beamforming implementations.

REFERENCES

[1] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Euro. Trans. Telecomm., vol. 10, no. 6, pp. 585–596, Nov. 1999.
[2] H. Weingarten, Y. Steinberg, and S. Shamai, “The capacity region of the Gaussian multiple-input multiple-output broadcast channel,” IEEE Trans. Inform. Theory, vol. 52, no. 9, pp. 3936–3964, Sept. 2006.
[3] U. Erez, S. Shamai, and R. Zamir, “Capacity and lattice-strategies for cancelling known interference,” in Proc. ISITA 2000, Honolulu, Hawaii, USA, Nov. 2000, pp. 681–684.
[4] W. Yu and J. Cioffi, “Sum capacity of Gaussian vector broadcast channels,” IEEE Trans. Inform. Theory, vol. 50, no. 9, pp. 1875–1892, Sept. 2004.
[5] J. Kusuma and K. Ramchandran, “Communicating by cosets and applications to broadcast,” in Proc. CISS, Mar. 2002.
[6] M. Schubert and H. Boche, “Iterative multiuser uplink and downlink beamforming under SINR constraints,” IEEE Trans. Signal Processing, vol. 53, no. 7, pp. 2324–2334, July 2005.
[7] W. Yu and T. Lan, “Transmitter optimization for the multi-antenna downlink with per-antenna power constraints,” IEEE Trans. Signal Processing, vol. 55, no. 6, part 1, pp. 2646–2660, June 2007.
[8] A. Wiesel, Y. Eldar, and S. Shamai, “Linear precoding via conic optimization for fixed MIMO receivers,” IEEE Trans. Signal Processing, vol. 54, no. 1, pp. 161–176, Jan. 2006.
[9] T. Yoo and A. Goldsmith, “On the optimality of multi-antenna broadcast scheduling using zero-forcing beamforming,” IEEE J. Select. Areas Commun., special issue on 4G wireless systems, vol. 24, no. 3, pp. 528–541, Mar. 2006.
[10] M. Kobayashi and G. Caire, “An iterative water-filling algorithm for maximum weighted sum-rate of Gaussian MIMO-BC,” IEEE J. Select. Areas Commun., vol. 24, no. 8, pp. 1640–1646, Aug. 2006.
[11] S. Shi, M. Schubert, and H. Boche, “Rate optimization for multiuser MIMO systems with linear processing,” IEEE Trans. Signal Processing, vol. 56, no. 8, pp. 4020–4030, Aug. 2008.
[12] M. Codreanu, A. Tolli, M. Juntti, and M. Latva-aho, “MIMO downlink weighted sum rate maximization with power constraint per antenna groups,” in Proc. IEEE VTC Spring, Apr. 2007.
[13] D. Guo, S. Shamai (Shitz), and S. Verdu, “Mutual information and minimum mean-square error in Gaussian channels,” IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1261–1282, Apr. 2005.
[14] H. Sampath, P. Stoica, and A. Paulraj, “Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion,” IEEE Trans. Commun., vol. 49, no. 12, pp. 2198–2206, Dec. 2001.
[15] D. Palomar and S. Verdu, “Gradient of mutual information in linear vector Gaussian channels,” IEEE Trans. Inform. Theory, vol. 52, no. 1, pp. 141–154, Jan. 2006.
[16] K. B. Petersen and M. S. Pedersen, “The matrix cookbook,” Feb. 2008, version 20070905. [Online]. Available: http://www2.imm.dtu.dk/pubdb/p.php?3274
[17] R. Hunger, W. Utschick, D. Schmidt, and M. Joham, “Alternating optimization for MMSE broadcast precoding,” in Proc. IEEE ICASSP, vol. 4, 2006, pp. IV-757–IV-760.
[18] M. Joham, K. Kusume, M. Gzara, W. Utschick, and J. Nossek, “Transmit Wiener filter for the downlink of TDD DS-CDMA systems,” in Proc. IEEE Seventh International Symposium on Spread Spectrum Techniques and Applications 2002, vol. 1, pp. 9–13, Sept. 2002.
[19] N. Jindal, W. Rhee, S. Vishwanath, S. Jafar, and A. Goldsmith, “Sum power iterative water-filling for multi-antenna Gaussian broadcast channels,” IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1570–1580, Apr. 2005.
[20] T. Cover and J. Thomas, Elements of Information Theory. Wiley & Sons, Inc., 1991.
[21] W. Yu, W. Rhee, S. Boyd, and J. Cioffi, “Iterative water-filling for Gaussian vector multiple-access channels,” IEEE Trans. Inform. Theory, vol. 50, no. 1, pp. 145–152, Jan. 2004.