Noncoherent energy-based communications for the massive SIMO MAC

Mainak Chowdhury, Alexandros Manolakos and Andrea Goldsmith, Fellow, IEEE

Abstract We consider several single antenna transmitters communicating with a multi-antenna receiver, with the latter decoding transmitted information at the end of each symbol time. Motivated by the optimal noncoherent detector in a Rayleigh fading channel, we propose a noncoherent energy-based communication scheme that does not require any knowledge of instantaneous channel state information at either the transmitter or the receiver; it uses only the statistics of the channel and noise. The proposed scheme involves choosing the transmitted constellation points based on a minimum distance criterion and the receiver sensing only the average energy across all the receive antennas. We show, for very general channel fading statistics, that the performance of the proposed scheme is the same, in a scaling law sense, as that of the optimal coherent scheme exploiting perfect channel knowledge. Furthermore, we present a numerical evaluation of the performance of this scheme in representative fading and noise statistics.

Index Terms Massive MIMO, Noncoherent Communication, Energy Receiver, Multiuser Communication

The authors are with the Department of Electrical Engineering, Stanford University, Stanford, CA - 94305. Questions or comments can be addressed to {mainakch, amanolak, andreag}@stanford.edu. Parts of this work were presented at CISS, 2014 and at ISIT, 2014. This work is supported by a 3Com Corporation Stanford Graduate Fellowship, an Alcatel-Lucent Stanford Graduate Fellowship, an A.G. Leventis Foundation scholarship, the NSF Center for Science of Information (CSoI): NSF-CCF0939370, NSF grant 1320628, ONR grant N000141210063 and by a research grant from CableLabs.

April 22, 2015

DRAFT

I. INTRODUCTION

Systems with a large number of antennas at either the transmitters or receivers enjoy promising asymptotic properties. As shown in [1], the effects of fast fading and noise vanish and even zero forcing receiver architectures perform well in the asymptotic limit. In practice, however, many of these benefits are tied to the assumption of reasonably good channel state information (CSI) at the transmitter and the receiver. Obtaining perfect CSI is often a bottleneck in systems with large antenna arrays operating either at high speeds or at high carrier frequencies. Acquisition of CSI in cellular systems requires the use of orthogonal pilot sequences in the individual cells. In fact, in a surprising result in [1], it was shown that the pilot contamination problem, due to the reuse of pilot sequences in adjacent cells, is the only factor limiting the rate of a cellular base station serving a fixed number of users with an infinite number of antennas. With a finite but large number of antennas, the channel estimation overhead manifests itself in additional performance degradation due to incorrect CSI [2], [3]. Moreover, in large antenna arrays, the RF chain design with perfect CSI acquisition quickly becomes prohibitive in terms of cost and energy, thereby motivating analog or hybrid antenna array designs [4].

In spite of these challenges, massive MIMO systems have been considered in recent years as a viable technology for millimeter wave (mmWave) communications [8], [9]. Such systems exploit huge chunks of unused spectrum in the 10-300 GHz band, and hence have the potential to support multiple gigabit-per-second data rates. However, given the difficulty of CSI acquisition in massive MIMO systems, in this work we ask how much of the potential performance gain from a large antenna deployment can be achieved without any instantaneous CSI at either the receiver (CSIR) or the transmitter (CSIT).
We consider one-shot achievability schemes, i.e., we do not use coding across different symbol times or coherence intervals and we assume that the receiver decodes information at the end of each symbol time. We show that this one-shot noncoherent scheme achieves the same scaling behavior of achievable rates with an increasing number of antennas as perfectly coherent schemes over larger coherence times and large decoding blocklengths (i.e., multi-shot schemes). The reason why we can communicate reliably in this setting even without knowing the instantaneous CSI is that there is a lot of spatial diversity in a large antenna array, which, as we show later in the paper, may obviate the need for any channel state information.

In this manuscript, we propose an average energy-based noncoherent communication system with multiple (but fixed) single antenna transmitters and a receiver with a large number of antennas. The receiver performs only an average energy measurement and decodes based on the estimate of the average received energy at the end of each symbol time. For this scheme, we characterize the rate of decay of the achievable symbol error probability as a function of the number of antennas at the receiver. Our analysis suggests that the proposed average energy-based communication system can achieve the same performance as a communication system with perfect channel state information at the receiver (CSIR) in terms of the scaling law of achievable rates with an increasing number of receive antennas. We also consider a constellation design problem for the proposed communication scheme, based on the minimum distance criterion used to establish the scaling law result. We provide numerical results which demonstrate the performance of this scheme for representative fading and noise statistics.

Note that while our achievable scheme uses symbol-by-symbol transmission and decoding, our results on the optimal scaling laws are applicable even to multi-shot schemes. In particular, multi-shot schemes can increase the error exponents associated with the probability of error or improve the pre-log factor in the optimal achievable rates, but do not change the rate scaling. Lastly, we point out that while our analysis of achievable rates does not consider coding across time, we show that incorporating coding does help in improving the error exponent associated with the achievable bit error rates. We also compare the performance of a joint multiuser constellation design with a single user design multiplexed in time.
We observe that while a joint multiuser design can outperform time division schemes for certain channels, it is not uniformly better. We present an example of a channel for which this is not the case. The rest of the paper is organized as follows. In Section II we present the system model and the average energy-based noncoherent detector, together with the optimal noncoherent maximum likelihood (ML) decoder. Section III summarizes related work. Then, Section IV presents an upper bound on the symbol error probability as a function of the number of receive antennas.


Section V shows, using simple constellation designs, that for a general multiuser system, the performance of the energy-based decoder is the same, in a scaling law sense, as the performance of the optimal coherent detector. Section VI presents and solves a constellation design problem involving a minimum distance criterion (a more exhaustive treatment of constellation design is presented in [11], including comparison with the minimum distance criterion described here). Section VII presents numerical evaluations of the proposed schemes, demonstrating their symbol error performance in typical scenarios with an increasing number of receive antennas. We present conclusions and future research directions in Section VIII.

A. Notation

We use $[n]$ to denote the set $\{1, 2, \cdots, n\}$, where $n$ is an integer. $\mathbb{C}$ refers to the set of complex numbers and $\mathbb{C}^{n \times m}$ is the set of all complex-valued matrices of size $n \times m$. For a matrix $H \in \mathbb{C}^{n \times m}$, the $(i,j)$-th element is denoted by $H_{i,j}$. $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ represent the real and imaginary parts, respectively. $E[U]$ denotes the expectation of the random variable $U$. $\mathcal{CN}(\mu, R)$ represents the distribution of circularly symmetric complex Gaussian (CSCG) random vectors with mean vector $\mu$ and covariance matrix $R$. The symbol $\triangleq$ is used to denote a definition, $\Theta(\cdot)$ to indicate equality up to a constant, and $\doteq$ to denote equality to the first order in the exponent, i.e., $a_n \doteq b_n$ means that $\lim_{n \to \infty} \frac{1}{n} \log \frac{a_n}{b_n} = 0$. For $x, y \in \mathbb{R}^n$, $x > y \Rightarrow x_i > y_i$, $i \in [n]$.

II. SYSTEM MODEL

We consider $m$ single antenna transmitters and one receiver with $n$ antennas. The system is represented as

$$y = Hx + \nu, \tag{1}$$

with $y \in \mathbb{C}^{n \times 1}$, $x \in \mathbb{C}^{m \times 1}$, $\nu \in \mathbb{C}^{n \times 1}$, $H = [H_1, H_2, \ldots, H_m] \in \mathbb{C}^{n \times m}$, $H_j \in \mathbb{C}^{n \times 1}$ and each $\nu_i \sim \mathcal{CN}(0, \sigma^2)$. The $i$th element of $H_j$ is distributed as $H_{i,j} \sim f(h)$, such that $E[H_{i,j}] = \mu_j$, $E[|H_{i,j} - \mu_j|^2] = \sigma_j^2$, and $|\mu_j|^2 + \sigma_j^2 = 1$ for all $j \in [m]$. We further assume that the density function $f(h)$ is such that, for any fixed $x \in \mathbb{C}^m$, the moment generating function of $|y_i|^2$, i.e., $M(\theta) \triangleq E[e^{\theta |y_i|^2}]$, exists and is twice differentiable in an interval around $\theta = 0$. Many fading distributions fall within this model, e.g., Rayleigh and Rician fading [12], in which case $H_{i,j} \sim \mathcal{CN}(\mu_j, \sigma_j^2)$.

We assume that the instantaneous channel realization is unknown to both the transmitters and the receiver and investigate schemes for recovering transmitter data based only on the knowledge of the statistics of the system. As already described, we focus only on symbol-by-symbol encoding and decoding schemes for achievability. We will show that a particular noncoherent one-shot scheme suffices to achieve optimal scaling under more general settings of larger coherence times and larger decoding blocklengths.

In this work, motivated by the simplicity of an average energy-based receiver design, we focus only on schemes which use the average energy of the received signal to reliably transfer the information; that is, the information is modulated on the energy of the transmitted symbols and the receiver estimates the energy of the received signals. A natural question that arises is how much is lost in performance by using the average energy-based decoder. As shown in Section V, in terms of the scaling law of achievable rates, there is no loss in performance with this decoder with respect to the optimal decoder using perfect channel state information as the number of antennas grows asymptotically large. We next describe the encoding and decoding procedure.

A. Encoder

For $j \in [m]$, the $j$th transmitter transmits symbols from the constellation $\mathcal{C}_j = \{c_{1,j}, c_{2,j}, \cdots, c_{L_j,j}\} \subset \mathbb{C}$, subject to an individual average power constraint

$$\frac{1}{L_j} \sum_{i=1}^{L_j} |c_{i,j}|^2 \le 1, \quad \forall j. \tag{2}$$

Here $c_{i,j} \in \mathcal{C}_j$ is the $i$th constellation point of the $j$th transmitter and the total size of the constellation for transmitter $j$ is $L_j$. The set of all possible transmitted points is denoted by $\mathcal{C} = \{x \in \mathbb{C}^m : x_j \in \mathcal{C}_j, \forall j \in [m]\}$. By definition, the cardinality of $\mathcal{C}$ is given by $|\mathcal{C}| = \prod_{j=1}^{m} L_j \triangleq L$. We now describe our decoders.

B. Decoders

We describe two different decoders. The first one, the average energy-based decoder, does not require phase information at any of the receiver antennas and is suboptimal for general fading statistics, in terms of minimizing the symbol error probability. The second decoder, the ML decoder, utilizes phase information at all the receiver antennas and is the optimal decoder with respect to minimizing the symbol error probability. This work analyses the performance of the former in the case of a massive SIMO MAC system.

1) Energy-based decoder: Based on its knowledge of the statistics of the channels and noise and of the constellations $\{\mathcal{C}_j\}_{j=1}^{m}$, the decoder divides the positive real line into non-intersecting intervals or decoding regions $\{I_x\}_{x \in \mathcal{C}}$. In order to detect the transmitted $x^* \in \mathcal{C}$, the decoder first computes

$$\frac{\|y\|^2}{n} \in \mathbb{R}_+, \tag{3}$$

i.e., it computes the average received power across all the antennas. It then returns

$$\hat{x} \in \left\{ x \in \mathcal{C} : \frac{\|y\|^2}{n} \in I_x \right\}.$$

Then, the probability of error given that $x^*$ is transmitted is given by $P_e(x^*) \triangleq \Pr\{x^* \ne \hat{x}\}$.

2) Noncoherent ML decoder: While the energy-based decoder is the optimal choice for some channel statistics and has good asymptotic properties, for general statistics, it may not be optimal. The optimal detector with respect to minimizing the probability of symbol error, assuming equiprobable signaling and only the knowledge of the channel distribution at the receiver, is given by the following maximum likelihood decoder:

$$\hat{x} \in \arg\max_{x} f(y),$$

where $f(\cdot)$ is the likelihood function of the output $y$ given the channel and noise statistics, and the fact that $x$ was transmitted. It turns out that the two decoders are the same for some representative channel statistics. We summarize this observation in the following lemma.

Lemma 1. For zero mean Gaussian statistics on both the channel noise and the channel coefficients (i.e., in i.i.d. Rayleigh fading with i.i.d. AWGN), the noncoherent ML decoder reduces to the average energy-based decoder.

Proof: The proof is presented in Appendix B.
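As an illustration of the energy-based decoder, the following minimal sketch simulates a single user in i.i.d. Rayleigh fading using only the Python standard library. Several choices here are assumptions made for the example and are not taken from the text: the function names, the 2-bit constellation with equally spaced energies, the noise variance of 0.1, and the decoding regions, which are taken to be the intervals whose boundaries lie midway between consecutive values of the mean per-antenna energy $|x|^2 + \sigma^2$ (obtained by direct expansion of $E[|y_i|^2]$ for a zero mean, unit variance channel).

```python
import math
import random


def average_received_energy(x, n, noise_var, rng):
    """Return ||y||^2 / n for one use of y = h*x + nu with n receive antennas."""
    total = 0.0
    h_std = math.sqrt(0.5)               # CN(0, 1): real and imaginary parts have variance 1/2
    nu_std = math.sqrt(noise_var / 2.0)  # CN(0, noise_var)
    for _ in range(n):
        h = complex(rng.gauss(0.0, h_std), rng.gauss(0.0, h_std))
        nu = complex(rng.gauss(0.0, nu_std), rng.gauss(0.0, nu_std))
        total += abs(h * x + nu) ** 2
    return total / n


def energy_decode(energy, constellation, noise_var):
    """Pick the point whose mean energy |x|^2 + noise_var is closest to the statistic.

    Nearest-mean decoding is equivalent to decoding intervals with midpoint boundaries
    (an assumed choice of decoding regions for this sketch).
    """
    return min(
        range(len(constellation)),
        key=lambda k: abs(energy - (abs(constellation[k]) ** 2 + noise_var)),
    )


def symbol_error_rate(constellation, n, noise_var, trials, seed=0):
    """Monte Carlo estimate of the symbol error rate with equiprobable symbols."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        sent = rng.randrange(len(constellation))
        energy = average_received_energy(constellation[sent], n, noise_var, rng)
        errors += energy_decode(energy, constellation, noise_var) != sent
    return errors / trials


# Hypothetical 2-bit constellation with energies 0, 2/3, 4/3, 2 (average power 1).
points = [math.sqrt(2.0 * k / 3.0) for k in range(4)]
ser_few = symbol_error_rate(points, n=4, noise_var=0.1, trials=200)
ser_many = symbol_error_rate(points, n=1000, noise_var=0.1, trials=100)
print(ser_few, ser_many)
```

The decoder never touches the phase of the received samples, only their squared magnitudes, and the estimated error rate drops as the number of antennas grows from 4 to 1000.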

III. RELATED WORK

The use of noncoherent schemes in wireless systems is not new. The earliest incarnations of wireless cellular systems (from 1G systems which used differential modulation to 2G systems which used frequency shift keying [12]) were noncoherent due to limitations in device manufacturing and a lack of strong incentives to build spectrally efficient systems. As manufacturing improved and spectrum became more expensive, coherent systems became prevalent due to their spectral efficiency (almost all 3G and 4G systems). With many emerging applications, however, spectrum is no longer a bottleneck and processing complexity and power consumption are equally important, if not more important, considerations. Important and power hungry components of many high speed systems are the RF front end (the clock recovery and phase acquisition circuits) and the analog to digital converter (ADC). These are increasingly a bottleneck not only in low power applications but also in applications with larger antenna arrays [13] and higher carrier frequencies. This makes the simplicity of noncoherent designs especially attractive in such systems.

The resurgence of interest in noncoherent designs for modern communication systems led to a lot of fundamental work on different aspects of noncoherent system design in the late 1990s and early 2000s. A noncoherent communication system that uses the Generalized Likelihood Ratio Test (GLRT) is described in [14]. In this scheme, the channel and the transmitted symbols are jointly estimated at the receiver. The authors propose a minimum distance criterion for signal and code design by characterizing the performance of the GLRT decoder in the AWGN channel for the high SNR regime. They subsequently apply this to DPSK and QAM alphabets. As described in [14], the GLRT decoder is identical to the noncoherent ML decoder in general settings (as long as the phase of the channel distribution is uniformly distributed over $[0, 2\pi]$ and the signals are of equal energy). Furthermore, [15] focuses on the noncoherent ML decoder and proposes signal constellation designs using a metric motivated by a union bound on the probability of error. Similar metrics, motivated by a high SNR analysis, are presented in [16], where the worst-case chordal distance is employed to space the codewords as far apart as possible.

An important feature of this line of work is its emphasis on high SNR. Fundamental insights for the high SNR regime in a block fading channel model were derived in [17], [18], which also derived optimal signaling distributions for the high SNR noncoherent channel. Optimal high SNR input distributions with coherence times shorter than the total number of antennas have been derived in [19]. The capacity expressions in some of these works suggest in particular that small coherence times or small rank of the channel matrix can be very detrimental. Multiuser counterparts of these ideas can be found in [20], [21]. In particular, [20] shows that time division strategies may be capacity achieving in noncoherent high SNR settings.

There is also a line of work on characterizing the achievable rates in block fading channels (possibly taking into account the channel estimation overhead) in the regime of finite SNR and coherence times. In [6], the authors consider the effect of pilot based channel estimation on achievable rates and identify the capacity loss due to channel estimation as being proportional to the square root of the Doppler of the channel.
In [22] the authors determine the mutual information for Gaussian codebooks on block fading channels for general SNR, coherence time and number of transmit/receive antennas. In that work, however, it is not immediately clear how the achievable rates scale with the number of antennas. Moreover, while Gaussian codebooks are capacity-achieving under general conditions for additive Gaussian noise channels, under multiplicative noise they may no longer be optimal. For example, [23] and [24] show that the capacity of a single input single output (SISO) channel with Rayleigh and Rician fading and a coherence time of 1 is achieved by a discrete input distribution. While the structure of the capacity achieving distribution for a general noncoherent SIMO channel is not known to the best of our knowledge, our analysis in this paper is applicable to characterizing the achievable rates for general discrete input distributions in SIMO channels with a large number of receiver antennas.

In this work, we consider a finite transmit power and an asymptotic analysis in the number of receiver antennas. To the best of our knowledge, we are the first to consider an average energy-based noncoherent massive MIMO system with an asymptotically large number of antennas. Moreover, we observe that asymptotics in the number of receiver antennas can have a very different qualitative behavior compared to the asymptotics of high SNR.

IV. ERROR PROBABILITY UPPER BOUNDS AND THE RATE FUNCTION

In this section we analyse the performance of the proposed energy detector in the massive SIMO MAC. From our system model, we have

$$\frac{\|y\|^2}{n} = \frac{\|Hx + \nu\|^2}{n} = \frac{\|Hx\|^2}{n} + \frac{\|\nu\|^2}{n} + 2\,\frac{\mathrm{Re}((Hx)^* \nu)}{n} \to \sum_{i=1}^{m} \left(|\mu_i|^2 + \sigma_i^2\right) |x_i|^2 + 2 \sum_{i<j} \mathrm{Re}\left((\mu_i x_i)^* (\mu_j x_j)\right) + \sigma^2 \quad \text{a.s. as } n \to \infty.$$

Thus, in the asymptotic regime of large $n$, the received energy statistic converges almost surely to a deterministic quantity depending on the statistics of the channel and the constellation points transmitted. For a finite $n$ the value may deviate from this limiting value by a noise term. To characterize the noise appearing in the system, we use the following lemma [25]:

Lemma 2. For any $a > 0$ and zero mean, i.i.d. random variables $u_1, \ldots, u_n$, we have that

$$\Pr\left( \frac{\sum_{i=1}^{n} u_i}{n} \ge a \right) \doteq e^{-n I(a)},$$

where $I(a) = \sup_{\theta > 0} \left( \theta a - \log E[e^{\theta U}] \right)$ and $U$ denotes a random variable with the common distribution of the $u_i$.

For completeness, a short proof of this fact is presented in Appendix A. We use Lemma 2 to characterize the deviation of $\|y\|^2/n$ from its limiting value $r(x)$, where

$$r(x) \triangleq \sum_{i=1}^{m} \left(|\mu_i|^2 + \sigma_i^2\right) |x_i|^2 + 2 \sum_{i<j} \mathrm{Re}\left((\mu_i x_i)^* (\mu_j x_j)\right) + \sigma^2 \tag{4}$$

is the expectation of the received energy statistic $\|y\|^2/n$. To do so, choose $u_i$ in Lemma 2 to be

$$u_i = \left| \sum_{j} H_{i,j} x_j + \nu_i \right|^2 - r(x), \tag{5}$$

thereby making $u_i$ a zero mean random variable. Note that $u_i$ depends on $x$ and the statistics of $H_{i,j}$ and $\nu_i$. With this definition, the following holds:

$$\frac{\|y\|^2}{n} = r(x) + \frac{\sum_i u_i}{n}.$$

Using Lemma 2, we get that

$$\Pr\left( \frac{\|y\|^2}{n} - r(x) > a \right) \doteq e^{-n I_{1,x}(a)},$$

where

$$I_{1,x}(a) \triangleq \sup_{\theta > 0} \left( \theta a - \log E[e^{\theta U}] \right).$$

Similarly, by invoking the same lemma for the random variable $-u_i$, we get the following:

$$\Pr\left( \frac{\|y\|^2}{n} - r(x) < -a \right) \doteq e^{-n I_{2,x}(a)},$$

where

$$I_{2,x}(a) \triangleq \sup_{\theta > 0} \left( \theta a - \log E[e^{-\theta U}] \right).$$

Combining the above two results and using a union bound, we get that

$$\Pr\left( \left| \frac{\|y\|^2}{n} - r(x) \right| > a \right) \doteq e^{-n I_x(a)},$$

where $I_x(a) \triangleq \min\left(I_{1,x}(a), I_{2,x}(a)\right)$ is defined as the rate function of the (possibly multiuser) constellation point $x$. Observe that $I_x(a)$ is the error exponent associated with $P_e(x)$, i.e.,

$$I_x(a) = \lim_{n \to \infty} -\frac{\log(P_e(x))}{n},$$

when the transmitters send $x$, and the receiver uses a decoding interval of $I_x = (r(x) - a, r(x) + a]$. We now show how these results may be used to design good constellations for infinite and/or finite $n$ systems. Before that, we collect some observations about $I_x(a)$ in the following lemma, which are proved in Appendix C.

Lemma 3. The rate function $I_x(a)$ enjoys the following properties.

1) The rate function is non-negative and monotonically increasing for positive $a$.

2) For a fixed $a$ and a scalar $x \in \mathbb{C}$, the rate function is monotonically decreasing with increasing $x$.

3) For any finite $\delta_0 > 0$ and $\epsilon > 0$, there exists a large enough $\sigma^2$ (noise variance) such that

$$\left| \frac{I_{x+\delta}(a)}{I_x(a)} - 1 \right| \le \epsilon, \quad \forall \delta < \delta_0. \tag{6}$$

4) The rate function for any zero mean $U$ with a twice differentiable moment generating function around 0 satisfies

$$\lim_{a \to 0} \frac{I_x(a)}{a^2} = \frac{1}{2 E[U^2]}. \tag{7}$$

The third point implies that in the limiting case of low SNR, the rate function $I_x(a)$ does not depend on the transmitted point $x$. Recall that the rate function depends on $\sigma^2$ through the random variable $u_i$ defined in (5). Moreover, for a small $a$, the rate function is quadratic with a weight determined by the second order statistics of $u_i$ defined earlier.
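To make the rate function concrete, the sketch below works out one special case that is not spelled out in the text: a single user in Rayleigh fading, for which $|y_i|^2$ is exponentially distributed with mean $r = r(x) = |x|^2 + \sigma^2$. Under that assumption, $\log E[e^{\theta U}] = -\log(1 - \theta r) - \theta r$ for $\theta < 1/r$, and a short calculation gives $I_{1,x}(a) = t - \log(1 + t)$ and $I_{2,x}(a) = -t - \log(1 - t)$ with $t = a/r$ (the latter for $a < r$). The function names and the numerical values are illustrative only. The code compares these closed forms with a direct numerical evaluation of the suprema and checks the small-$a$ behaviour in (7), using $E[U^2] = r^2$.

```python
import math


def sup_concave(f, lo, hi, iters=200):
    """Maximise a concave function f on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))


def rate_upper_numeric(a, r):
    """I_1(a) = sup_{theta>0} theta*a - log E[exp(theta*U)], U = E - r, E ~ Exp(mean r)."""
    # log E[exp(theta*U)] = -log(1 - theta*r) - theta*r, finite only for theta < 1/r.
    f = lambda th: th * a + math.log(1.0 - th * r) + th * r
    return sup_concave(f, 0.0, (1.0 - 1e-12) / r)


def rate_lower_numeric(a, r):
    """I_2(a) = sup_{theta>0} theta*a - log E[exp(-theta*U)], assuming 0 < a < r."""
    # log E[exp(-theta*U)] = -log(1 + theta*r) + theta*r for every theta > 0.
    f = lambda th: th * a + math.log(1.0 + th * r) - th * r
    return sup_concave(f, 0.0, 100.0 / r)


def rate_upper_closed(a, r):
    """Closed form t - log(1 + t), t = a / r (Rayleigh fading assumption)."""
    t = a / r
    return t - math.log1p(t)


def rate_lower_closed(a, r):
    """Closed form -t - log(1 - t), t = a / r, valid for a < r."""
    t = a / r
    return -t - math.log1p(-t)


r, a = 1.1, 0.2  # e.g. |x|^2 = 1 and noise variance 0.1 (illustrative values)
print(rate_upper_numeric(a, r), rate_upper_closed(a, r))
print(rate_lower_numeric(a, r), rate_lower_closed(a, r))
```

In this special case the upper deviation exponent is the smaller of the two, so $I_x(a) = I_{1,x}(a)$, and for small $a$ both exponents approach $a^2 / (2 r^2)$.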

V. ASYMPTOTIC THROUGHPUT CHARACTERIZATION

In this section, we compare the achievable throughput of our energy-based decoding scheme with that of a coherent scheme as a function of the number of receiver antennas. We first define the symmetric ergodic capacity

$$C \triangleq \sum_{i=1}^{m} E\left[ \log\left(1 + \sigma_i(H)^2 / \sigma^2\right) \right]$$

of the coherent channel, where $\sigma_i(H)$ is the $i$th largest singular value of $H$, so that $\sigma_i(H) \ge \sigma_{i+1}(H) \ge 0$ for all $i$ in $[m-1]$. In the limit of large $n$, it can be shown that

$$0 < \xi_1 \le \frac{\sigma_i(H)^2}{n} \le \xi_2, \quad \text{almost surely}$$

for all $i$ in $[m]$ and for some $\xi_1, \xi_2 > 0$ independent of $n$. This gives us that $C = \Theta(\log n)$ for coherent systems. The following lemma shows that this same scaling behavior can be achieved by our noncoherent scheme.

Lemma 4. There is an $m$-user constellation with bounded average power per user which achieves a rate of $\tilde{K} \log_2 n$ bits per transmission per transmitting user for some $\tilde{K} > 0$, with vanishing probability of error with an increasing number of receiver antennas $n$.

Proof. We show this by explicitly constructing a constellation which has the above scaling behavior. We fix $M$ to be an integer and choose the following constellations:

$$\mathcal{C}_j = \left\{ \sqrt{\frac{2(i-1) M^{j-1}}{M^m}} : i \in [M] \right\},$$

for all $j \in [m]$. Note that the energy in each constellation point of $\mathcal{C}_j$ is bounded above by 2. Note also that these constellations satisfy the average power constraint (2) for all users, that the $j$th user uses the constellation $\mathcal{C}_j$, and that each transmitter achieves a rate of $\log_2(M)$. Using these constellations, we note that

$$\min_{c_1, c_2 \in \mathcal{C},\, c_1 \ne c_2} |r(c_1) - r(c_2)| = \frac{2}{M^m},$$

where we use the fact that $|\mu_j|^2 + \sigma_j^2 = 1$ and $\mathrm{Re}(\mu_{j_2}^* \mu_{j_1}) \le 1$ for $j_1 \ne j_2$. Choosing $M = n^{\tilde{K}}$ for some $\tilde{K} > 0$, we get that each user achieves a rate of $\log M = \tilde{K} \log n$.

We now proceed to bound the probability of error. For a small $a$, we note that using Lemma 3, we have

$$\min_x I_x(a) \approx \min_x \frac{a^2}{2 E[U^2]} = \frac{a^2}{2 \alpha_0}, \tag{8}$$

where

$$\alpha_0 \triangleq \max_x \left(E[U^2]\right).$$

It follows that (8) represents a lower bound on the worst case error exponent across all constellation points. We now note that $E[U^2]$ is finite since it may only depend on the channel and noise statistics through the first four moments (which are bounded due to the existence of moment generating functions) and through the first four powers of the transmitted symbols. Moreover, since $|c_{i,j}|^2 \le 2$ for all $i, j$, $E[U^2]$ can be bounded above by a constant depending only on $m$ and independent of $n$. It may also be shown that in the limit of low SNR, $E[U^2]$ is the same regardless of the transmitted constellation points. Thus a sufficient criterion to get a vanishing error probability while achieving a $\tilde{K} \log(n)$ rate is to guarantee a high minimum distance between $r(c_1)$ and $r(c_2)$ for $c_1, c_2 \in \mathcal{C}$, $c_1 \ne c_2$. By the small $a$ characterization of the rate function in Lemma 3, we can choose decision regions for which the symbol error rate $P_e$ satisfies

$$P_e \le e^{-nt/(0.5 M^m)^2} = e^{-nt/(0.5\, n^{\tilde{K} m})^2} \le e^{-t_0 n^{1 - 2\tilde{K} m}},$$

where $t_0$ is a constant independent of $n$. By choosing $\tilde{K}$ sufficiently small, i.e., $\tilde{K} < \frac{1}{2m}$, we see that the error rate vanishes as $n \to \infty$. This establishes Lemma 4.

We now mention a surprising outcome of Lemma 4 which is also the main thesis of our paper.

Theorem 1. Noncoherent communications achieve the same scaling behavior of achievable rates in a massive SIMO system as perfectly coherent systems ($\Theta(\log n)$) in the limit of a large number of receiver antennas.

Proof. Follows directly from Lemma 4 and the observation that the one-shot scheme used to establish Lemma 4 is a special case of more general noncoherent schemes over larger coherence times and decoding blocklengths.

Note that the proof of Lemma 4 above uses a sufficient condition to guarantee a vanishing probability of error while at the same time giving a $\tilde{K} \log(n)$ scaling of achievable rates. This condition involves focusing on just the minimum distance between (possibly multiuser) constellation points at the receiver. We explore different aspects of this constellation design in the following two sections.
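The construction used in the proof of Lemma 4 can be checked numerically for small parameters. The sketch below is illustrative only: it assumes zero mean fading for every user ($\mu_j = 0$, so that the cross terms in $r(x)$ vanish and the noise-free part of $r(x)$ is just the sum of the users' energies), and the function names and the values $M = 4$, $m = 2$ are chosen for the example. Exact rational arithmetic is used so that the minimum gap can be compared with $2/M^m$ without rounding.

```python
from fractions import Fraction
from itertools import product


def lemma4_energies(M, m):
    """Energies |c_{i,j}|^2 = 2 (i-1) M^(j-1) / M^m for i in [M] and users j in [m]."""
    return [
        [Fraction(2 * (i - 1) * M ** (j - 1), M ** m) for i in range(1, M + 1)]
        for j in range(1, m + 1)
    ]


def received_levels(energies):
    """Noise-free part of r(x) under zero mean fading: the sum of the users' energies."""
    return sorted(sum(combo) for combo in product(*energies))


def min_gap(levels):
    """Smallest spacing between consecutive sorted levels."""
    return min(b - a for a, b in zip(levels, levels[1:]))


M, m = 4, 2  # illustrative values: 2 bits per user, 2 users
energies = lemma4_energies(M, m)
levels = received_levels(energies)
print("minimum gap:", min_gap(levels))
print("distinct received levels:", len(set(levels)))
print("average power per user:", [sum(e) / len(e) for e in energies])
```

The per-user energies act like the digits of a base-$M$ expansion, which is why all $M^m$ received levels are distinct and evenly spaced in this zero mean case.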

VI. MINIMUM DISTANCE CONSTELLATION DESIGN

We first consider single user constellation designs and then present how multiuser designs are different.

A. Design problem for single user

We have the following optimization problem

$$\underset{\mathcal{C}:\ \frac{1}{L_1} \sum_{i=1}^{L_1} |c_{i,1}|^2 \le 1}{\text{maximize}} \quad \min_{x_1, x_2 \in \mathcal{C},\, x_1 \ne x_2} |r(x_1) - r(x_2)|, \tag{9}$$

where, by (4) with $m = 1$, $r(x)$ evaluates to $r(x) = (|\mu_1|^2 + \sigma_1^2)|x|^2 + \sigma^2 = |x|^2 + \sigma^2$. The solution to this is given by

$$|c_{i,1}|^2 = \frac{2(i-1)}{L_1 - 1}, \quad i \in \{1, 2, \cdots, L_1\}, \tag{10}$$

where $L_1 = L$. Note that the phase of the particular constellation points we choose does not matter, since we use an average energy decoder and the noise is complex Gaussian with uniform phase. So we choose the phase to be zero, i.e., $c_{i,1} \in \mathbb{R}_+$. An achievable scheme follows by setting the decision regions $I_x = (r(x) - a, r(x) + a]$, with $a = \frac{1}{L-1}$, which leads to the following upper bound on the probability of symbol error

$$P_e = \frac{1}{L} \sum_{x \in \mathcal{C}} P_e(x) \le U_L \doteq \max_{x \in \mathcal{C}} e^{-n I_x(a)}, \tag{11}$$

where

$$U_L \triangleq \frac{1}{L} \left( e^{-n I_{1,c_{1,1}}(a)} + e^{-n I_{2,c_{2,1}}(a)} + e^{-n I_{1,c_{2,1}}(a)} + \cdots + e^{-n I_{1,c_{L_1 - 1,1}}(a)} + e^{-n I_{2,c_{L_1,1}}(a)} \right).$$

Numerical evaluation of the symbol error rates achieved using this scheme is presented in Section VII.
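The single user design can be sketched in a few lines. The code below builds the energy levels in (10), and evaluates a bound of the form $U_L$ for one specific case chosen here as an assumption: Rayleigh fading, where $|y_i|^2$ is exponential with mean $r(x) = |x|^2 + \sigma^2$ and the Chernoff exponents take the closed forms $I_{1,x}(a) = t - \log(1+t)$ and $I_{2,x}(a) = -t - \log(1-t)$ with $t = a/r(x)$. These closed forms are a standard computation for exponential random variables rather than something stated in the text, and the function names and parameter values are hypothetical.

```python
import math


def power_levels(L):
    """Energy levels |c_i|^2 = 2 (i - 1) / (L - 1), i = 1, ..., L, as in (10)."""
    return [2.0 * (i - 1) / (L - 1) for i in range(1, L + 1)]


def upper_bound(L, n, noise_var):
    """U_L for Rayleigh fading (assumed), with decision half-width a = 1 / (L - 1)."""
    a = 1.0 / (L - 1)
    total = 0.0
    for k, energy in enumerate(power_levels(L)):
        r = energy + noise_var   # limiting value of ||y||^2 / n
        t = a / r
        if k < L - 1:            # every point but the largest can overshoot upwards
            total += math.exp(-n * (t - math.log1p(t)))
        if k > 0:                # every point but the smallest can undershoot (t < 1 here)
            total += math.exp(-n * (-t - math.log1p(-t)))
    return total / L


levels = power_levels(4)
print("energy levels:", levels)
print("average power:", sum(levels) / len(levels))
for n in (100, 1000, 10000):
    print(n, upper_bound(4, n, noise_var=1.0))
```

The bound is dominated by the highest energy levels, whose received energy fluctuates the most; this is one way to see why many antennas are needed at low SNR with equally spaced energy levels.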

B. Design problem for 2 users

Similar to the one user case, to guarantee vanishing $P_e$ as $n$ increases, we treat the constellation design as a joint codebook design problem and we maximize the minimum distance between the received constellation points

$$\underset{\mathcal{C}:\ \frac{1}{L_j} \sum_{i=1}^{L_j} |c_{i,j}|^2 \le 1,\ \text{for all } j \in [2]}{\text{maximize}} \quad \min_{x_1, x_2 \in \mathcal{C},\, x_1 \ne x_2} |r(x_1) - r(x_2)|. \tag{12}$$

This, in general, is a non-convex problem due to the absolute value in the objective function. In order to simplify the problem, one can consider a total ordering on the received constellation points $\{r(x)\}_{x \in \mathcal{C}}$. With the total ordering constraint, (12) becomes efficiently solvable in some special cases. Depending on the channel statistics, the problem can be either a linear program, a convex quadratic program or a non-convex quadratic program. The details are presented in Appendix D. The performance obtained from these schemes is presented in Section VII.

C. Joint multiuser codebook design versus single user design with time division

A natural way to extend the single user designs to multiple users is to employ a time division strategy and use the single user codes for each user in a round robin fashion. In this section, we investigate how this time division strategy compares with a joint multiuser codebook design.

1) Rayleigh fading: We show that the suggested average energy-based system cannot achieve a higher rate function (or error exponent) than that of the corresponding single user designs when used in a time-division fashion. Specifically, we show in Appendix F that the joint codebook design always yields a smaller error exponent compared to a time division strategy when the channels experience symmetric Rayleigh fading.

2) AWGN channel and Rician fading: In this section we show that if the multiuser channels have non zero correlation (e.g., both of them are Rician channels with the same non zero K factor) a joint multiuser codebook design can outperform a time division strategy. The main intuition behind why this is the case is that the possible energy levels of the received signal for joint multiuser transmissions in such a case are more "separated" (and hence more distinguishable) due to the constructive interference of transmissions from different users at the receiver.

Consider 2 users with the same AWGN channel (see footnote 1) with noise variance $\sigma^2$. In time division, each user transmits in a round robin fashion for one slot from a constellation of size L2. In the suggested multiuser design, both users transmit concurrently for two slots using a joint constellation of size $L$, such that the average transmit power across two consecutive slots at each user is the same. In the time division scheme, the best single user constellation for low SNR is given by (10), that is

$$\mathcal{C}_{\mathrm{tdma}} = \left\{ 0, \sqrt{\frac{2}{3}}, \sqrt{\frac{4}{3}}, \sqrt{2} \right\}, \tag{13}$$

with decision regions $I_x = \left( x^2 + \sigma^2 - \frac{1}{3},\ x^2 + \sigma^2 + \frac{1}{3} \right]$, $\forall x \in \mathcal{C}_{\mathrm{tdma}}$. The above constellation satisfies the average power constraint and, in the limit of low SNR (i.e., high $\sigma^2$), it leads to the best rate function for the TDMA scheme, which is

$$\min_x I_x^{\mathrm{tdma}}\left(\frac{1}{3}\right) \approx I_0\left(\frac{1}{3}\right). \tag{14}$$

Footnote 1: Any Rician channel with non zero K could be used. The simplest example is to use an AWGN channel, which corresponds to infinite K.

In the above, $I_0(\cdot)$ is the rate function associated with the transmission of the zero symbol. Observe that, as the SNR becomes lower and lower, the best rate function is more and more independent of the transmitted symbol (Lemma 3). For the joint multiuser codebook design, consider the following constellations:

$$\mathcal{C}_1 = \left\{ \sqrt{\frac{2}{3}}, \sqrt{\frac{4}{3} + \epsilon} \right\}, \quad \mathcal{C}_2 = \left\{ 0, \sqrt{2 - 3\epsilon} \right\}, \tag{15}$$

for any $0 \le \epsilon < 2/15$. The above constellations satisfy the average power constraint for each user across two slots, assuming the users interchange the constellations, i.e., user 1 transmits from $\mathcal{C}_1$ during the first transmission slot and from $\mathcal{C}_2$ during the second time slot; user 2 transmits from $\mathcal{C}_2$ when user 1 transmits from $\mathcal{C}_1$ and from $\mathcal{C}_1$ otherwise. Since

$$\frac{1}{4} \sum_{c \in \mathcal{C}_1 \cup \mathcal{C}_2} |c|^2 = 1 - \frac{\epsilon}{2} \le 1,$$

and the transmitters employ equiprobable signalling, each user transmits on average signals of power at most 1. In the limit of infinite receive antennas, the received constellation points (excluding the common noise term $\sigma^2$) are

$$\left\{ \frac{2}{3},\ \frac{4}{3} + \epsilon,\ \frac{2}{3} + 2 - 3\epsilon + 2\sqrt{\frac{2}{3}\left(2 - 3\epsilon\right)},\ \frac{4}{3} - 2\epsilon + 2 + 2\sqrt{\left(\frac{4}{3} + \epsilon\right)\left(2 - 3\epsilon\right)} \right\}, \tag{16}$$

and the decision regions are such that the boundary points are located at the midpoint between consecutive received constellation points. It is easy to verify that for $0 \le \epsilon < 2/15$, the above constellation and decision regions lead to a minimum distance between received points which is always more than $\frac{1}{3}$, i.e., in the limit of low SNR, the rate exponent is

$$\min_c I_c^{\text{2-user}}\left(\frac{1}{3} + \tilde{\epsilon}\right) \approx I_0\left(\frac{1}{3} + \tilde{\epsilon}\right), \tag{17}$$

where $\tilde{\epsilon} > 0$. Based on Lemma 3 and equations (14) and (17) it follows that the joint multiuser codebook design can achieve a higher rate function than the single user design. Intuitively, the reason behind this is the fact that the multiuser design can make use of the constructive interference between the channels of the two transmitters. In situations where this constructive interference is small (or absent altogether), the gains are small or even nonexistent (e.g., same or symmetric Rayleigh fading for all users). Numerical examples of this are presented in Subsection VII-C.
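The comparison between time division and the joint design in the AWGN example can be checked with a short computation. This is a sketch under stated assumptions: both users see a unit gain AWGN channel, so the noise-free received energy is $|x_1 + x_2|^2$; the joint constellations are read as $\{\sqrt{2/3}, \sqrt{4/3 + \epsilon}\}$ and $\{0, \sqrt{2 - 3\epsilon}\}$, i.e., with $\epsilon$ inside the square roots, which is an interpretation of (15); and the function names and the value $\epsilon = 0.1$ are chosen for illustration.

```python
import math


def min_gap(levels):
    """Smallest spacing between consecutive received energy levels."""
    ordered = sorted(levels)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


def tdma_levels():
    """Noise-free received energies |x|^2 for the single user constellation (13)."""
    return [0.0, 2.0 / 3.0, 4.0 / 3.0, 2.0]


def joint_levels(eps):
    """Noise-free received energies |x1 + x2|^2 for the assumed reading of (15)."""
    c1 = [math.sqrt(2.0 / 3.0), math.sqrt(4.0 / 3.0 + eps)]
    c2 = [0.0, math.sqrt(2.0 - 3.0 * eps)]
    return [(x1 + x2) ** 2 for x1 in c1 for x2 in c2]


def joint_average_power(eps):
    """Per-slot average power of a user alternating between the two constellations."""
    return (2.0 / 3.0 + (4.0 / 3.0 + eps) + 0.0 + (2.0 - 3.0 * eps)) / 4.0


eps = 0.1  # any value in [0, 2/15)
print("TDMA minimum gap: ", min_gap(tdma_levels()))
print("joint minimum gap:", min_gap(joint_levels(eps)))
print("joint average power:", joint_average_power(eps))
```

Under these assumptions the joint design widens the smallest gap from $2/3$ to $2/3 + \epsilon$, so each decision region can be widened by $\epsilon/2$ on either side while the average power stays within the constraint.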

VII. NUMERICAL RESULTS

We now present different aspects of the proposed communication system design through numerical simulations. Subsection VII-A demonstrates the single user performance for the constellation design of Subsection VI-A for Rician fading with different values of the K-factor. The corresponding results for two users are presented in Subsection VII-B, in which case we use the constellations generated as explained in Subsection VI-B. A comparison of the joint multiuser codebook design for two users with the corresponding time division scheme is presented in Subsection VII-C. Subsection VII-D compares the energy-based detector with the noncoherent ML detector for both Rayleigh and Rician fading. Finally, Subsection VII-E demonstrates through an example that time coding across multiple channel realizations could provide additional gains over the one-shot system.

A. Single user performance

We plot the Monte Carlo estimate of the probability of symbol error with the corresponding analytical bound UL for the case of a single user with an increasing number of receive antennas and a constellation with equidistant power levels, for a Rician fading channel with K-factor K = 0 (Fig. 1) and K = 10 (Fig. 2). We consider additive white Gaussian noise with variance $\sigma^2 \in \{0.1, 1\}$, and 2-bit (L = 4) and 3-bit (L = 8) constellations.

[Figure 1, panels (a) $\sigma^2 = 1$ (low SNR) and (b) $\sigma^2 = 0.1$ (high SNR): probability of symbol error versus number of antennas; curves Pe and UL for L = 4 and L = 8.]

Fig. 1: SER performance of the single user design in Rayleigh fading (K = 0) at low and high SNR.

We observe that, as the LOS factor increases, the performance of the system improves significantly. Furthermore, we see that for the current constellation design, even if it is asymptotically the best as $\sigma^2$ increases, an impractically large number of receive antennas is needed in order to get a reasonable performance for values of $\sigma^2$ of practical interest. In [26], we demonstrate that by optimizing the constellations, it is possible to get good performance with a number of antennas that is quite feasible even with today's commercial offerings of large antenna systems.
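The single user experiment described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the simulation code used for the figures: it assumes equidistant power levels with unit average power, a Rician channel with unit mean energy per antenna, circularly symmetric complex Gaussian noise, and a detector that picks the power level whose expected average energy is closest to the measured one. The function names and default parameters are hypothetical.

```python
import math
import random


def equidistant_powers(L, avg_power=1.0):
    """Power levels 0, d, 2d, ..., (L-1)d whose mean equals avg_power (assumed design)."""
    step = 2.0 * avg_power / (L - 1)
    return [i * step for i in range(L)]


def complex_gauss(rng, variance):
    """Circularly symmetric complex Gaussian sample with the given total variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_ser(n, L, K, sigma2, trials=400, seed=0):
    """Monte Carlo symbol error rate of an average-energy detector with n antennas."""
    rng = random.Random(seed)
    powers = equidistant_powers(L)
    los = math.sqrt(K / (K + 1.0))   # line-of-sight amplitude (assumed Rician model)
    scatter_var = 1.0 / (K + 1.0)    # scattered power, so that E|h|^2 = 1
    errors = 0
    for _ in range(trials):
        sent = rng.randrange(L)
        amp = math.sqrt(powers[sent])
        energy = 0.0
        for _ in range(n):
            h = los + complex_gauss(rng, scatter_var)
            y = h * amp + complex_gauss(rng, sigma2)
            energy += abs(y) ** 2
        avg_energy = energy / n
        # Expected average energy for power p is p + sigma2 because E|h|^2 = 1.
        decided = min(range(L), key=lambda i: abs(avg_energy - (powers[i] + sigma2)))
        errors += decided != sent
    return errors / trials


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, simulate_ser(n, L=4, K=10.0, sigma2=0.1))
```

Under these assumptions the estimated error rate should fall quickly as n grows, in line with the qualitative trend discussed above; the exact numbers depend on the modelling choices and on the seed.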

B. Two users performance

We now consider two users, whose constellations are designed as described in Subsection VI-B, for the same channels as above, i.e., Rician fading with K = 0 (Fig. 3) and K = 10 (Fig. 4) and $\sigma^2 \in \{0.1, 1\}$, for a 1-bit and a 2-bit constellation per user with the same average power per user as before. Observe that the error rate for the 1-bit constellation (L = 2 in the figures), i.e., on-off keying, is significantly better than for the 2-bit constellation (L = 4) as the number of antennas increases. Furthermore, the gap between the upper bound and the Monte Carlo estimate of the probability of symbol error appears to be constant, which is a good indication that the former serves as a good approximation of the latter, at least in Rician fading channels.

[Figure 2, panels (a) $\sigma^2 = 1$ (low SNR) and (b) $\sigma^2 = 0.1$ (high SNR): probability of symbol error versus number of antennas; curves Pe and UL for L = 4 and L = 8.]

Fig. 2: SER performance of the single user design in Rician fading (K = 10) at low and high SNR.

[Figure 3, panels (a) $\sigma^2 = 1$ and (b) $\sigma^2 = 0.1$: probability of symbol error versus number of antennas; curves Pe and UL for L = 2 and L = 4.]

Fig. 3: SER performance of the 2-user design in Rayleigh fading (K = 0) at low and high SNR.

C. Joint multiuser codebook versus time division scheme

In this scenario, we numerically compare the joint two-user codebook designs of Subsection VI-B with the case of time division between the two users when the average power does not exceed one. We assume that, for the joint multiuser codebook, the average transmitted power per user over two consecutive slots does not exceed one and that each user transmits one bit per symbol. The numerical results are for Rician fading (Fig. 5) with K = 100 and $\sigma^2 = 1/2$. Observe that the joint multiuser design achieves better performance compared to the design with time division across the users.

[Figure 4, panels (a) $\sigma^2 = 1$ and (b) $\sigma^2 = 0.1$: probability of symbol error versus number of antennas; curves Pe and UL for L = 2 and L = 4.]

Fig. 4: SER performance of the 2-user design in Rician fading (K = 10) at low and high SNR.

[Figure 5: probability of symbol error versus number of antennas; curves Multiuser Pe, Multiuser UL, TDM Pe and TDM UL.]

Fig. 5: 2-user design vs. time division coding for Rician fading with K = 100 and $\sigma^2 = 1/2$ at a rate of 1 bit per user. The 2-user multiuser scheme achieves better performance compared to the TDM single user design.

Considering even higher $\sigma^2$ leads to the same conclusions.

D. Comparison with the noncoherent ML detector

As shown in Appendix B, using the noncoherent ML detector (which in general requires using phase information in addition to energy measurements) does not improve performance if the channel and noise statistics are zero mean Gaussian. In most other cases, phase detectors give significant performance improvements. We present numerical results for the case of a single user using the optimal ML decoder and compare it with the average energy-based decoder with equidistant constellation points. We consider a low SNR ($\sigma^2 = 1$), and Rician fading with both low (K = 0.1) and high (K = 100) K-factors (Fig. 6). Recall that having equidistant constellation points is optimal for the average energy-based decoder at low SNR.

[Figure 6: probability of symbol error versus number of antennas; curves ML and Energy-based for K = 20 dB and K = −10 dB.]

Fig. 6: Performance comparison of the noncoherent ML and energy-based decoders for a 2-bit constellation in a Rician fading scenario with low SNR ($\sigma^2 = 1$) and low (K = 0.1) and high (K = 100) K-factor. Observe that at a low K-factor, the difference between the ML and the energy-based detector is negligible.

E. Coding

In this section, we consider the case of a single transmitter which uses codewords that span multiple symbol durations (T of them, for some T > 1). We also assume that the channel changes during each symbol. We show through an explicit example that it is possible to get a strictly better symbol error exponent for the same average rate by transmitting and jointly decoding codewords over T = 3 symbols, even though the asymptotic scaling law is the same for both cases with an increasing number of receive antennas. Specifically, consider a receiver with n antennas which measures the energies of three consecutive symbols, i.e.,

$$\left( \frac{\|y^{(1)}\|^2}{n}, \frac{\|y^{(2)}\|^2}{n}, \frac{\|y^{(3)}\|^2}{n} \right), \tag{18}$$

where $y^{(l)} \in \mathbb{C}^{n \times 1}$ is the received signal in the $l$-th slot, such that $y^{(l)} = H^{(l)} \sqrt{p_l} + \nu$ with $H^{(l)} \in \mathbb{C}^{n \times 1}$, and $p_l$ is the transmitted power, such that $\{p_1, p_2, p_3\}$ comes from some codebook $\mathcal{C}^{(3)}$.

Consider the case of $\sigma^2 \gg 1$, i.e., the noise power is much greater than the average signal power P = 1. In this case, using Lemma 3 it follows that the error exponent of each constellation point is approximately independent of the transmitted power, i.e., the rate function depends only on the minimum distance between neighboring constellation points, which are now points in $\mathbb{R}^T$, and thus a minimum-distance based constellation design is asymptotically optimal. Without coding across time, where the user transmits one bit of information per symbol, the constellation points $(p_1, p_2, p_3)$ can take only $2^T = 8$ possible values, i.e.,

$$\mathcal{C} = \{\{0,0,0\}, \{0,0,2\}, \{0,2,0\}, \{2,0,0\}, \{0,2,2\}, \{2,0,2\}, \{2,2,0\}, \{2,2,2\}\},$$

for which the minimum Euclidean distance is $d_{\min} = 2$. Using a different code, however, one can achieve a minimum distance of $d_{\min} = 2.18$ with the same average power. This constellation is as follows:

$$\mathcal{C}^* = \{\{0,\alpha,\alpha\}, \{0,0,0\}, \{0,0,\alpha\}, \{\alpha,0,\alpha\}, \{0,0,2\alpha\}, \{0,\alpha,0\}, \{\alpha,0,0\}, \{\alpha,\alpha,0\}\},$$

where $\alpha = 2.18$. Thus, for $\sigma \gg 1$, coding helps to increase the Euclidean distance between constellation points and to increase the rate function. Note that the latter may not be the best performance that can be achieved with a block code of length three and may be improved upon further by searching more exhaustively over constellation points. The empirical SER performance of the above codes is shown in Fig. 7. In [10], preliminary results are presented which demonstrate such gains in more detail, by either constructing a specific codebook or applying random coding bounds.

[Figure 7: probability of symbol error versus number of antennas (axis scaled by $10^4$); curves $\mathcal{C}$ and $\mathcal{C}^*$.]

Fig. 7: Comparison of probability of symbol error between $\mathcal{C}$ and $\mathcal{C}^*$.
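The distance and power claims for the two length-three codebooks of this subsection can be checked directly. The short Python sketch below lists both codebooks as tuples of transmit powers and computes the minimum Euclidean distance and the average power per symbol; the helper names `min_distance` and `average_power` are illustrative and not from the paper.

```python
import itertools
import math


def min_distance(codebook):
    """Smallest Euclidean distance between any two distinct codewords."""
    return min(math.dist(a, b) for a, b in itertools.combinations(codebook, 2))


def average_power(codebook):
    """Average transmit power per symbol, assuming equiprobable codewords."""
    total = sum(sum(codeword) for codeword in codebook)
    return total / (len(codebook) * len(codebook[0]))


# Uncoded scheme: each of the three slots independently uses power 0 or 2.
uncoded = list(itertools.product((0.0, 2.0), repeat=3))

# Coded scheme from the text, with alpha = 2.18.
alpha = 2.18
coded = [
    (0, alpha, alpha), (0, 0, 0), (0, 0, alpha), (alpha, 0, alpha),
    (0, 0, 2 * alpha), (0, alpha, 0), (alpha, 0, 0), (alpha, alpha, 0),
]

if __name__ == "__main__":
    print("uncoded:", min_distance(uncoded), average_power(uncoded))
    print("coded:  ", min_distance(coded), average_power(coded))
```

The coded set has total power $11\alpha$ over 24 symbol slots, so its average power is $11\alpha/24 \approx 0.999$ for $\alpha = 2.18$, while its minimum distance equals $\alpha$.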

VIII. CONCLUSIONS AND FUTURE WORK

We consider a noncoherent average energy-based communication scheme for the massive SIMO MAC. Using a one-shot transmission and decoding scheme, we characterize the symbol error rate performance of the proposed system for general fading and noise statistics. Our analysis shows that in terms of the scaling law of achievable rates with an increasing number of receive antennas, the performance of this scheme comes arbitrarily close to that of a coherent system exploiting instantaneous channel state information and coding over large coherence times and blocklengths. Moreover, we present a simple constellation design scheme based on maximizing the minimum distance, an analytically tractable sufficient condition to guarantee vanishing probability of error with an increasing number of antennas.

The achievable schemes presented in this work suggest that the spatial diversity already present in a multiple antenna system can not only help us design simple systems, but can also help us achieve close to optimal performance. In particular, in this work, we show that this holds even for a symbol-by-symbol communication system. However, in general, taking into account multiple time slots, finite SNR, or detectors beyond average energy detectors will help the noncoherent performance.

A comprehensive theory analyzing when to use noncoherent communication over coherent communication is the ultimate goal of this line of research. While coherent communication underlies many modern communication system designs, the increased overhead and difficulty of channel state acquisition, together with the increased spatial diversity in future systems, may make noncoherent communication a competitive paradigm in the not-too-distant future.

IX. ACKNOWLEDGEMENTS

The authors would like to thank the editor and the anonymous reviewers for their thoughtful comments, which helped to greatly improve the presentation of the paper. The first author would like to thank Yair Yona for helpful discussions. The authors would also like to thank Jinyuan Chen for insightful discussions concerning the use of TDMA schemes in multiuser systems.

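APPENDIX A

In this appendix, we outline a proof for the large deviation bounds stated in Lemma 2. By an application of the Markov inequality, we have that, for some $\theta > 0$,

$$\begin{aligned}
\Pr\left(\frac{\sum_{i=1}^{n} u_i}{n} \ge a\right) &= \Pr\left(e^{\theta \frac{\sum_{i=1}^{n} u_i}{n}} \ge e^{\theta a}\right) \\
&\le \frac{E\left[e^{\theta \frac{\sum_{i=1}^{n} u_i}{n}}\right]}{e^{\theta a}} \\
&= \frac{\prod_{i=1}^{n} E\left[e^{\theta \frac{u_i}{n}}\right]}{e^{\theta a}} \\
&\le e^{-n I(a)},
\end{aligned}$$

where $I(a) \triangleq \sup_{\theta > 0} \left( \frac{\theta}{n} a - \log E\left[e^{\frac{\theta}{n} U}\right] \right)$. We now establish that, for all $\epsilon > 0$,

$$\Pr\left(\frac{\sum_{i=1}^{n} u_i}{n} \ge a\right) \ge e^{-n(I(a) + \epsilon)},$$

for $n$ large enough. Consider the $\theta^*$ such that $I(a) = \theta^* a - \log E\left[e^{\theta^* U}\right]$. We can then define a change of measure

$$d\mu'(u) = \frac{e^{\theta^* u}}{E\left[e^{\theta^* U}\right]} \, d\mu(u),$$

where $\mu(u)$ is the original measure (e.g., induced by the definition (5)). Under this change of measure, note that $\int u \, d\mu'(u) = a$ (this follows from the first-order condition for optimality of $\theta^*$). Thus if $\mu_n$ (and $\mu'_n$) represents the distribution of $\frac{1}{n}\sum_{i=1}^{n} u_i$ under $\mu$ (and $\mu'$ respectively), then we have that

$$\mu_n(A) = \int_{x \in A} \left( E\left[e^{\theta^* U}\right] e^{-\theta^* x} \right)^n d\mu'_n(x),$$

where $A = \{x \mid x \ge a\}$. A lower bound on $\mu_n(A)$ can then be obtained by considering, for some $\delta > 0$, $A_\delta = \{x \mid a \le x \le a + \delta\}$. We get that

$$\mu_n(A) \ge \mu_n(A_\delta) \ge \left( E\left[e^{\theta^* U}\right] e^{-\theta^* (a + \delta)} \right)^n \mu'_n(A_\delta).$$

Noting that by the CLT, $\mu'_n(A_\delta) \to \frac{1}{2}$ as $n \to \infty$, we get that

$$\lim_{n \to \infty} \frac{\log \mu_n(A)}{n} \ge -I(a) - \theta^* \delta.$$

By choosing $\delta$ small enough we get that

$$\Pr\left(\frac{\sum_{i=1}^{n} u_i}{n} \ge a\right) \ge e^{-n(I(a) + \epsilon)},$$

for $n$ large enough.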
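The upper bound of this appendix can be illustrated numerically. The sketch below is a toy check under an assumed distribution that is not taken from the paper: $u_i = X_i - 1$ with $X_i$ exponential with unit mean (the energy of a unit-variance Rayleigh coefficient, without noise). For this choice $\log E[e^{\theta U}] = -\theta - \log(1 - \theta)$ for $\theta < 1$, the maximizer is $\theta^* = a/(1+a)$, and the rate function has the closed form $I(a) = a - \log(1 + a)$. The function names are hypothetical.

```python
import math
import random


def rate_function(a):
    """I(a) = a - log(1 + a) for U = X - 1 with X ~ Exp(1) (illustrative choice)."""
    return a - math.log1p(a)


def chernoff_bound(n, a):
    """Upper bound exp(-n I(a)) on Pr(sample mean of u_i >= a)."""
    return math.exp(-n * rate_function(a))


def empirical_tail(n, a, trials=5000, seed=1):
    """Monte Carlo estimate of Pr(sample mean of u_i >= a)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean_u = sum(rng.expovariate(1.0) - 1.0 for _ in range(n)) / n
        hits += mean_u >= a
    return hits / trials


if __name__ == "__main__":
    n, a = 50, 0.3
    print("empirical tail:", empirical_tail(n, a))
    print("Chernoff bound:", chernoff_bound(n, a))
```

The empirical tail probability should lie below the bound; the two differ by a factor that is subexponential in n, which is what the matching lower bound above captures.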

APPENDIX B

In this appendix, we show how the ML decoder mentioned in Subsection II-B2 reduces to the energy-based decoder in Subsection II-B1 under the stated assumptions on the Gaussian statistics of the channel. We first note that, given a point $x \in \mathbb{C}^m$, the output of the channel is distributed as $y = Hx + \nu$ and has the density function

$$y \sim \frac{1}{\left(\pi(|x|^2 + \sigma^2)\right)^n} \, e^{-\|y\|^2 / (|x|^2 + \sigma^2)}.$$

Thus the log-likelihood function $L(x)$ for zero mean Gaussian channel and noise is as follows:

$$L(x) = -w_1(x) \|y\|^2 + w_2(x),$$

for appropriate functions $w_1(x)$ and $w_2(x)$. Thus the noncoherent ML decoder corresponds to specifying decision regions for the sufficient statistic $\|y\|^2$, which is precisely the average energy-based decoder.
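As an illustration of this reduction, the Python sketch below compares a brute-force ML decision with a threshold test on the average energy. It assumes the density above with scalar power levels $|x|^2$, so that $w_1(x) = 1/(|x|^2 + \sigma^2)$ and $w_2(x) = -n \log(\pi(|x|^2 + \sigma^2))$; the thresholds are obtained by solving $L(x_1) = L(x_2)$ for consecutive power levels. The function names and the example power levels are illustrative and not from the paper.

```python
import math


def log_likelihood(energy, n, power, sigma2):
    """L(x) = -||y||^2 / (|x|^2 + sigma^2) - n log(pi (|x|^2 + sigma^2))."""
    a = power + sigma2
    return -energy / a - n * math.log(math.pi * a)


def ml_decision(energy, n, powers, sigma2):
    """Index of the power level that maximizes the log-likelihood."""
    return max(range(len(powers)),
               key=lambda i: log_likelihood(energy, n, powers[i], sigma2))


def energy_thresholds(powers, sigma2):
    """Thresholds on ||y||^2 / n between consecutive (sorted) power levels."""
    thresholds = []
    for p1, p2 in zip(powers, powers[1:]):
        a1, a2 = p1 + sigma2, p2 + sigma2
        thresholds.append(a1 * a2 / (a2 - a1) * math.log(a2 / a1))
    return thresholds


def threshold_decision(energy, n, powers, sigma2):
    """Decide by counting how many thresholds the average energy exceeds."""
    avg = energy / n
    return sum(avg > t for t in energy_thresholds(powers, sigma2))


if __name__ == "__main__":
    powers, sigma2, n = [0.0, 2 / 3, 4 / 3, 2.0], 1.0, 64
    for avg in (0.8, 1.5, 2.1, 3.5):
        print(avg, ml_decision(avg * n, n, powers, sigma2),
              threshold_decision(avg * n, n, powers, sigma2))
```

Both decisions depend on the received vector only through $\|y\|^2$, and in this sketch they coincide for every energy value away from the thresholds.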

APPENDIX C

In this appendix, we prove the properties of the rate function mentioned in Lemma 3. We first show monotonicity and positivity for all $a > 0$.

Monotonicity with a

We note that for $a_1 > a_2$ and $\theta > 0$,

$$\theta a_1 - \log E\left[e^{\theta U}\right] > \theta a_2 - \log E\left[e^{\theta U}\right].$$

Thus

$$\sup_{\theta} \left( \theta a_1 - \log E\left[e^{\theta U}\right] \right) > \sup_{\theta} \left( \theta a_2 - \log E\left[e^{\theta U}\right] \right),$$

or $I_x(a_1) > I_x(a_2)$. Note also that for all $a > 0$, $I_x(a) \ge I_x(0) = 0$. Thus strict monotonicity and positivity of $I_x(a)$ for $a \ge 0$ is established.

Monotonicity with x for $x \in \mathbb{C}$

For $x_1, x_2 \in \mathbb{C}$ such that $|x_1|^2 > |x_2|^2$, we show that $E\left[e^{\theta U}\right]$ is larger for $x_1$ than for $x_2$. This establishes the monotonicity claimed in the theorem.

We start by writing out $U$ in terms of $x$, as was done in (5), and simplifying for $m = 1$: $u_i = |H_{i,1} x + \nu_i|^2 - |x|^2 - \sigma^2$. Conditioning on $H_{i,1}$, we observe that

$$E\left[e^{\theta u_i}\right] = E_{H_{i,1}}\left[ \frac{e^{\frac{|H_{i,1}|^2 |x|^2 \theta}{1 - 2\theta\sigma^2}} \, e^{(-|x|^2 - \sigma^2)\theta}}{1 - 2\theta\sigma^2} \right] \tag{19}$$

$$= E_{H_{i,1}}\left[ \frac{e^{(k|H_{i,1}|^2 - 1)|x|^2 \theta} \, e^{-\sigma^2 \theta}}{1 - 2\theta\sigma^2} \right], \tag{20}$$

for $k \triangleq \frac{1}{1 - 2\theta\sigma^2} > 1$.

We now observe that for any random variable $y$ with $E[y] > 0$, $E\left[e^{|x|^2 y}\right] = E\left[e^{|x|^2 y_+}\right] + E\left[e^{-|x|^2 y_-}\right] - 1$ is an increasing function of $|x|^2$, where $y_+ = \max(y, 0)$ and $y_- = \max(0, -y)$. This may be seen by computing the derivative with respect to $|x|^2$ and observing that the derivative is positive, i.e.,

$$E\left[y_+ e^{|x|^2 y_+}\right] - E\left[y_- e^{-|x|^2 y_-}\right] > E[y] > 0.$$

Applying this with $y = k|H_{i,1}|^2 - 1$, we get our result.

Low a asymptotics of $I_x(a)$

We first observe that $I_x(a)$ is a strictly convex function. This follows from the fact that $I_x(a)$ is the supremum (over $\theta$) of the functions $g(\theta) = \theta a - \log E\left[e^{\theta U}\right]$, each of which is affine and hence convex in $a$, and that $I_x(a)$ is strictly monotonic. Moreover, note that $g(\theta)$ is differentiable, at least where $E\left[e^{\theta U}\right]$ exists and is finite. By differentiating, we write down a necessary condition for the $\theta^*$ which maximizes $g(\theta)$:

$$a \, E\left[e^{\theta^* U}\right] = E\left[U e^{\theta^* U}\right].$$

We now characterize the dependence of $\theta^*$ on $a$. By taking partial derivatives with respect to $a$ on both sides we get that

$$\frac{\partial \theta^*}{\partial a} = \frac{E\left[e^{\theta^* U}\right]}{E\left[U^2 e^{\theta^* U}\right] - a \, E\left[U e^{\theta^* U}\right]}. \tag{21}$$

We now note that at $a = 0$, $\theta^* = 0$, and that the denominator on the right hand side of (21) is a finite positive constant. This suggests that as $a \to 0$, $\frac{\theta^*}{a} \to s$ for some constant $s > 0$. We can find the constant $s$ by noting that

$$\left. \frac{\partial \theta^*}{\partial a} \right|_{a=0} = s = \frac{1}{E[U^2]}. \tag{22}$$

From this we observe that

$$\lim_{a \to 0} \frac{\theta^* a - \log E\left[e^{\theta^* U}\right]}{a^2} = \frac{1}{2 E[U^2]}.$$
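The quadratic behavior of the rate function near $a = 0$ can be checked numerically on a toy example. The sketch below assumes, purely for illustration, that $U = X^2 - 1$ with $X$ standard normal, for which $\log E[e^{\theta U}] = -\theta - \frac{1}{2}\log(1 - 2\theta)$ for $\theta < 1/2$ and $E[U^2] = 2$, so the limit above should equal $1/4$. The supremum over $\theta$ is found by a ternary search, which is valid because the objective is concave in $\theta$; the function names are hypothetical.

```python
import math


def log_mgf(theta):
    """log E[exp(theta U)] for U = X^2 - 1, X ~ N(0, 1); finite for theta < 1/2."""
    return -theta - 0.5 * math.log1p(-2.0 * theta)


def rate_function(a, iters=200):
    """Maximize theta * a - log_mgf(theta) over 0 < theta < 1/2 by ternary search."""
    lo, hi = 0.0, 0.5 - 1e-12
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * a - log_mgf(m1) < m2 * a - log_mgf(m2):
            lo = m1
        else:
            hi = m2
    theta = 0.5 * (lo + hi)
    return theta * a - log_mgf(theta)


if __name__ == "__main__":
    second_moment = 2.0  # E[U^2] = Var(X^2) for a standard normal X
    for a in (0.5, 0.1, 0.01):
        print(a, rate_function(a) / a ** 2, 1.0 / (2.0 * second_moment))
```

For this example the supremum also has the closed form $I(a) = \frac{a}{2} - \frac{1}{2}\log(1 + a)$, which gives a direct check of the numerical search.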

Low SNR behavior of the rate function

We start from the definition of the rate function

$$I_x(a) = \sup_{\theta > 0} \left( \theta a - \log E\left[e^{\theta U}\right] \right),$$

where

$$u_i = \left| \sum_j H_{i,j} c_j + \nu_i \right|^2 - r(c).$$

Note that if $\nu_i \sim \mathcal{N}(0, \sigma^2)$, we have the following equivalent definition for $I_x$:

$$I_x(a) = \sup_{\theta > 0} \left( \theta \frac{a}{\sigma^2} - \log E\left[e^{\theta \frac{U}{\sigma^2}}\right] \right).$$

The latter can be thought of as a rate function $\tilde{I}_c$ for the random variable $\tilde{U} = U/\sigma^2$ evaluated at $a/\sigma^2$. By using the results of the previous subsection, we observe that

$$\lim_{\sigma \to \infty} \sigma^4 \frac{\tilde{I}_c(a/\sigma^2)}{a^2} = \lim_{\sigma \to \infty} \frac{1}{2 E\left[\tilde{U}^2\right]} = k,$$

for some $0 < k < \infty$ independent of $c$ if $\|c\|_\infty < \infty$. This establishes the claim that for a large enough $\sigma^2$, i.e., in the low SNR regime, the rate functions are identical, i.e., independent of $c$.

APPENDIX D

In this appendix, we show how the optimization problem (12) for 2 users can be solved efficiently for certain channel statistics by suitably restricting the search space of possible constellation points in $\mathcal{C}$. To be specific, without loss of generality, assume that $c_{i,j} \le c_{k,j}$, $\forall k > i$, $j \in [m]$, that is, there exists a partial ordering of the transmit power levels of each user. This leads to

$$r(c_{i,j}) \le r(c_{k,j}), \quad \forall k > i, \ j \in [m]. \tag{23}$$

Coming up with a total ordering of $\{r(x)\}_{x \in \mathcal{C}}$ that is consistent with (23) would lead to the best solution of (12) which satisfies the chosen ordering. Let us reparametrize (12) as follows:

$$p_{i,1} = \sqrt{c_{i,1}}, \quad p_{i,2} = \sqrt{c_{i,2}}, \quad s_{k,l} = r([p_{k,1}^2, p_{l,2}^2]) = p_{k,1}^2 + p_{l,2}^2 + 2 p_{k,1} p_{l,2} \mu_1 \mu_2. \tag{24}$$

By expanding the partial order to a total ordering we formulate the following (possibly nonconvex) quadratic problem, where the total ordering information is used in the first constraint:

$$\begin{aligned}
\underset{\{p_{i,j}\}_{i \in [L_j], j \in [2]}, \ \{s_{k,l}\}_{k \in [L_1], l \in [L_2]}}{\text{maximize}} \quad & t \\
\text{s.t.} \quad & s_{k_1,l_1} + t < s_{k_2,l_2}, \ \text{if } s_{k_1,l_1} < s_{k_2,l_2}, \\
& s_{k,l} = p_{k,1}^2 + p_{l,2}^2 + 2 p_{k,1} p_{l,2} \mu_1 \mu_2, \\
& \frac{1}{L_j} \sum_{i=1}^{L_j} p_{i,j}^2 \le 1, \ \text{and } p_{i,j} \ge 0 \ \forall i, j.
\end{aligned} \tag{25}$$

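1) Special Cases: No LOS, Only LOS: In the case of zero LOS in any channel, i.e., $\mu_1 = 0$ or $\mu_2 = 0$, (25) becomes the following linear program:

$$\begin{aligned}
\underset{\{p_{i,j}^2\}_{i \in [L_j], j \in [2]}}{\text{maximize}} \quad & t \\
\text{s.t.} \quad & p_{k_1,1}^2 + p_{l_1,2}^2 + t < p_{k_2,1}^2 + p_{l_2,2}^2, \ \text{if } p_{k_1,1}^2 + p_{l_1,2}^2 < p_{k_2,1}^2 + p_{l_2,2}^2, \\
& \frac{1}{L_j} \sum_{i=1}^{L_j} p_{i,j}^2 \le 1, \ \forall i, j \in [2].
\end{aligned} \tag{26}$$

In the case of only LOS in both channels, i.e., $\mu_1 = \mu_2 = 1$, (25) reduces to the convex quadratic program

$$\begin{aligned}
\underset{\{p_{i,j}\}_{i \in [L_j], j \in [2]}}{\text{maximize}} \quad & t \\
\text{s.t.} \quad & (p_{k_1,1} + p_{l_1,2})^2 + t < (p_{k_2,1} + p_{l_2,2})^2, \ \text{if } (p_{k_1,1} + p_{l_1,2})^2 < (p_{k_2,1} + p_{l_2,2})^2, \\
& \frac{1}{L_j} \sum_{i=1}^{L_j} p_{i,j}^2 \le 1, \ \text{and } p_{i,j} \ge 0, \ \forall i, j \in [2].
\end{aligned} \tag{27}$$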

Therefore, in these special cases, one way to approach the problem is to enumerate all possible total orderings that agree with the initial partial ordering (23) (referred to as linear extensions [27]) and keep the solution that gives the largest objective function. However, since enumerating the set of linear extensions is computationally hard in general, we were able to identify the optimal total ordering only for small constellation sizes. For instance, consider the case of $\mu_1 = \mu_2 = 0$ and $L_1 = 3$, $L_2 = 4$. After enumerating all possible total orderings that agree with (23), two total orderings which lead to the best value for the objective function of (26) are depicted in Fig. 8. Optimal orderings have been identified through exhaustive search for all $L_1 \le 4$, $L_2 \le 4$. Fig. 9 depicts the minimum distance achieved for the total orderings depicted in Fig. 8, where we compare it with an upper bound on the minimum distance, $\bar{d}_{\min} = \frac{4}{L_1 L_2 - 1}$, obtained by considering a single super-user with $L_1 L_2$ constellation points and the same total power constraint.
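To give a feel for the size of this exhaustive search, the following Python sketch counts the total orderings of the received levels $s_{k,l}$ that respect the partial order. It assumes that (23) induces the grid order $s_{k,l} \le s_{k+1,l}$ and $s_{k,l} \le s_{k,l+1}$ and that ties are ignored; the function names are illustrative and not from the paper. It also evaluates the super-user bound $\bar{d}_{\min} = 4/(L_1 L_2 - 1)$ quoted above.

```python
from functools import lru_cache


def count_linear_extensions(L1, L2):
    """Count total orderings of an L1 x L2 grid consistent with the grid partial order."""

    @lru_cache(maxsize=None)
    def extend(filled):
        # filled[k] = number of levels s[k][0..] already placed from row k.
        if all(f == L2 for f in filled):
            return 1
        total = 0
        for k, f in enumerate(filled):
            # s[k][f] may come next once s[k-1][f] has been placed; the element
            # s[k][f-1] is already placed because rows are filled left to right.
            if f < L2 and (k == 0 or filled[k - 1] > f):
                total += extend(filled[:k] + (f + 1,) + filled[k + 1:])
        return total

    return extend((0,) * L1)


def superuser_dmin_bound(L1, L2):
    """Upper bound 4 / (L1 * L2 - 1) on the minimum distance (single super-user)."""
    return 4.0 / (L1 * L2 - 1)


if __name__ == "__main__":
    for L1, L2 in ((2, 2), (3, 4), (4, 4)):
        print(L1, L2, count_linear_extensions(L1, L2), superuser_dmin_bound(L1, L2))
```

Under these assumptions there are 462 candidate orderings for $L_1 = 3$, $L_2 = 4$, and the count grows quickly with the constellation sizes, which is why the search is practical only for small constellations.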

[Figure 8: two diagrams over the nodes $s_{k,l}$, $k \in \{1, 2, 3\}$, $l \in \{1, 2, 3, 4\}$, connected by arrows.]

Fig. 8: An arrow from $s_{k_1,l_1}$ to $s_{k_2,l_2}$ implies $s_{k_1,l_1} \le s_{k_2,l_2}$. The two total orderings in green lead to the same optimal constellation design for $L_1 = 3$, $L_2 = 4$, $\mu_1 = \mu_2 = 0$. Initial partial ordering is in red.

[Figure 9: $\log_{10}(d_{\min})$ versus $L_2$; curves Achievable and $\bar{d}_{\min}$ for $L_1 = 4$ and $L_1 = 16$.]

Fig. 9: The minimum distance achieved by the two orderings shown in Fig. 8 for different constellation sizes.

2) General Case: In the general case, imposing a total ordering does not convexify the problem. Yet, it is still possible to solve the non convex quadratic program (25) with a starting point that results from the no LOS or only LOS solution to find a local maximum. This approach has been used in the numerical results in Section VII.

APPENDIX E

In this appendix, we show how to solve an instance of the design problem outlined in Subsection VII-D for the case of Rician statistics. We show in particular that collecting phase information of the received signals can play a major role in improving the SER performance or in bringing down the number of antennas required to achieve a certain SER. The critical observation is that for Rayleigh fading and AWGN channels, $\|y\|^2$ is a sufficient statistic for decoding $x$ from $y$. For general statistics with equiprobable signalling, a likelihood detector will outperform this detector. For Rician fading, this likelihood detector is

$$\arg\min_{x} \left( \frac{\left\| y - \sum_j \mu_j x_j \mathbf{1} \right\|^2}{\sigma^2 + \sum_j \sigma_j^2 |x_j|^2} + n \log\left( \pi \left( \sigma^2 + \sum_j \sigma_j^2 |x_j|^2 \right) \right) \right).$$

We plot the performance with this detector in Subsection VII-D.

APPENDIX F

In this appendix, we show that a time division scheme always performs at least as well as a joint multiuser codebook design for symmetric Rayleigh fading, i.e., Rayleigh fading with the same average power for all users. We consider the following two design problems:

• A joint multiuser codebook design with a symmetric rate of $\frac{\log_2(L)}{m}$ per user.

• A single user constellation design with a rate of $\log_2(L)$.

The idea is that the single user constellation can be used in a TDMA scheme over $m$ time slots, and we investigate whether it is advantageous to perform a joint multiuser design. As described in (9), the single user design problem is

$$\underset{\mathcal{C} \,:\, \frac{1}{L} \sum_{i=1}^{L} |c_{i,1}|^2 \le 1}{\text{maximize}} \quad \min_{x_1, x_2 \in \mathcal{C}, \, x_1 \ne x_2} |r(x_1) - r(x_2)|.$$

The joint multiuser design problem (similar to the problem (12)) is

$$\begin{aligned}
\underset{\tilde{\mathcal{C}}, \ \{p_j\}_{j \in [m]}}{\text{maximize}} \quad & \min_{x_1, x_2 \in \tilde{\mathcal{C}}, \, x_1 \ne x_2} |r(x_1) - r(x_2)| \\
\text{s.t.} \quad & \frac{1}{L^{1/m}} \sum_{i=1}^{L^{1/m}} |c_{i,j}|^2 \le p_j, \ \text{for } j \in [m], \\
& \sum_j p_j \le 1, \quad p_j \ge 0.
\end{aligned}$$

Note that the design is over $\tilde{\mathcal{C}}$, which is a collection of $L$ $m$-dimensional constellation points. For any feasible $\tilde{\mathcal{C}}$, we can define a $\mathcal{C} = \{\|\tilde{x}\| \mid \tilde{x} \in \tilde{\mathcal{C}}\}$. We observe that for such a constellation,

$$\frac{1}{L} \sum_{x \in \mathcal{C}} |x|^2 \le 1.$$

Thus the set of feasible $r(\tilde{x})$ in the multiuser design problem is a subset of the achievable $r(x)$ for the single user case, and the objective attainable using a single user design with TDMA can be no worse than that of a joint multiuser design with the same constraints. In other words, in Rayleigh fading with symmetric average power, for the same average rate, the time division scheme will always have an error exponent which is at least the same as that of the joint multiuser design. Intuitively, this is due to the fact that, from an error exponent point of view, the joint design problem is more "constrained" than the single user design problem for symmetric Rayleigh fading channels.

REFERENCES

[1] T. L. Marzetta, "Noncooperative cellular wireless with unlimited numbers of base station antennas," IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600, 2010.
[2] K. T. Truong and R. W. Heath, "Effects of channel aging in massive MIMO systems," Journal of Communications and Networks, vol. 15, no. 4, pp. 338–351, 2013.
[3] A. Pitarokoilis, S. K. Mohammed, and E. G. Larsson, "Effect of oscillator phase noise on uplink performance of large MU-MIMO systems," in 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2012, pp. 1190–1197.
[4] V. I. Barousis, M. A. Sedaghat, R. R. Müller, and C. B. Papadias, "Massive antenna arrays with low front-end hardware complexity: An enabling technology for the emerging small cell and distributed network architectures," arXiv preprint arXiv:1407.7733, 2014.
[5] R. C. Daniels and R. W. Heath, "60 GHz wireless communications: emerging requirements and design recommendations," IEEE Vehicular Technology Magazine, vol. 2, no. 3, pp. 41–50, 2007.
[6] S. K. Yong and C.-C. Chong, "An overview of multigigabit wireless through millimeter wave technology: potentials and technical challenges," EURASIP Journal on Wireless Communications and Networking, vol. 2007, 2006.
[7] A. Manolakos, M. Chowdhury, and A. J. Goldsmith, "Constellation design in an energy-based noncoherent massive SIMO system," to be submitted to IEEE Transactions on Wireless Communications, 2014.
[8] A. Goldsmith, Wireless Communications. Cambridge University Press, 2005.
[9] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, "Massive MIMO for next generation wireless systems," arXiv preprint arXiv:1304.6690, 2013.
[10] D. Warrier and U. Madhow, "Noncoherent communication in space and time," 1999.
[11] M. L. McCloud, M. Brehler, and M. K. Varanasi, "Signal constellations for noncoherent space-time communications."
[12] A. Barg and D. Y. Nogin, "Bounds on packings of spheres in the Grassmann manifold," IEEE Transactions on Information Theory, vol. 48, no. 9, pp. 2450–2454, 2002.
[13] L. Zheng and D. N. C. Tse, "Communication on the Grassmann manifold: A geometric approach to the noncoherent multiple-antenna channel," IEEE Transactions on Information Theory, vol. 48, no. 2, pp. 359–383, 2002.
[14] B. M. Hochwald and T. L. Marzetta, "Unitary space-time modulation for multiple-antenna communications in Rayleigh flat fading," IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 543–564, 2000.
[15] W. Yang, G. Durisi, and E. Riegler, "On the capacity of large-MIMO block-fading channels," arXiv preprint arXiv:1202.0168, 2012.
[16] S. Shamai and T. L. Marzetta, "Multiuser capacity in block fading with no channel state information," IEEE Transactions on Information Theory, vol. 48, no. 4, pp. 938–942, 2002.
[17] S. Murugesan, E. Uysal-Biyikoglu, and P. Schniter, "Optimization of training and scheduling in the non-coherent SIMO multiple access channel," IEEE Journal on Selected Areas in Communications, vol. 25, no. 7, pp. 1446–1456, 2007.
[18] N. Jindal and A. Lozano, "A unified treatment of optimum pilot overhead in multipath fading channels," IEEE Transactions on Communications, vol. 58, no. 10, pp. 2939–2948, 2010.
[19] F. Rusek, A. Lozano, and N. Jindal, "Mutual information of IID complex Gaussian signals on block Rayleigh-faded channels," IEEE Transactions on Information Theory, vol. 58, no. 1, pp. 331–340, 2012.
[20] I. C. Abou-Faycal, M. D. Trott, and S. Shamai, "The capacity of discrete-time memoryless Rayleigh-fading channels," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1290–1301, 2001.
[21] M. C. Gursoy, H. V. Poor, and S. Verdú, "The noncoherent Rician fading channel-part I: structure of the capacity-achieving input," IEEE Transactions on Wireless Communications, vol. 4, no. 5, pp. 2193–2206, 2005.
[22] A. Dembo et al., Large Deviations Techniques and Applications. Springer, 2010, vol. 38.
[23] A. Manolakos, M. Chowdhury, and A. J. Goldsmith, "Constellation design in noncoherent massive SIMO systems," in IEEE Global Telecommunications Conference (GLOBECOM), 2014.
[24] B. Knott, A. Manolakos, M. Chowdhury, and A. J. Goldsmith, "Benefits of coding in a noncoherent massive SIMO system," to appear in Proceedings of IEEE ICC, 2015.
[25] G. Pruesse and F. Ruskey, "Generating linear extensions fast," SIAM Journal on Computing, vol. 23, no. 2, pp. 373–386, 1994.