Multi-user pdf estimation based criteria for adaptive blind ... - PPGETI

Comment

Report 2 Downloads 33 Views

ARTICLE IN PRESS

Signal Processing 85 (2005) 1059–1072 www.elsevier.com/locate/sigpro

Multi-user pdf estimation based criteria for adaptive blind separation of discrete sources Charles Casimiro Cavalcantea,, Joa˜o Marcos T. Romanob a

GTEL/DETI/CT/UFC, C.P. 6005, Campus do Pici, CEP: 60.455-900, Fortaleza-CE, Brazil DSPCom/DECOM/FEEC/UNICAMP, C.P. 6101, CEP: 13.083-852, Campinas-SP, Brazil

b

Received 28 April 2004; received in revised form 27 September 2004

Abstract This paper deals with criteria for adaptive blind separation of discrete sources. The criteria are based on the estimation of the probability density function (pdf) of the recovered signal using a parametric model and the divergence of Kullback–Leibler to measure the similarities between the involved signals. Two strategies that guarantee the recovering of all sources are employed: the ﬁrst one introduces a penalty when the sources are correlated and the second one constrains the ﬁltering to an orthogonal global system response. Simulations are carried out to evaluate the performance of the criteria compared with existing blind methods in typical multi-user environments such as spatial and space-time processing. r 2005 Elsevier B.V. All rights reserved. Keywords: Blind source separation; Discrete sources; Pdf estimation; Kullback–Leibler divergence; Constrained ﬁltering

1. Introduction Blind source separation (BSS) has been gaining increasing attention in the signal processing community due to its wide applicability in many ﬁelds such as digital communications, biomedical engineering and ﬁnancial data analysis among others [14]. Since the milestone work by He´rault et al. in 1985 [13], much effort has been done in order to Corresponding author. Tel.:/fax: +55 85 3288 9470.

E-mail address: [email protected] (C.C. Cavalcante).

construct suitable statistical criteria that reﬂect some known structural properties of the sources [19]. A common characteristic of many criteria is the use of higher order statistics (HOS), since second order statistics (SOS) are not sufﬁcient to solve the source separation problem. The information-theoretic approach has been introduced by Donoho in [11], who has treated the BSS problem from an entropy minimization viewpoint. Another well-known method to solve BSS problems is the use of contrast functions introduced by Comon [8], is any non-linear function which is invariant to permutation and scaling

0165-1684/$ - see front matter r 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2004.11.023

ARTICLE IN PRESS 1060

C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

matrices, and attains its minimum value in correspondence of the mutual independence among the output components [19]. Clearly, the methods are based in some existing requirements for BSS such as linear and time-invariant mixture, as many sensors as sources and mutually independent sources. These works have provided important results on the issue of necessary and sufﬁcient conditions to blindly provide source separation. Despite the development of techniques that rely directly on HOS, some single user techniques, such as constant modulus (CM) and Shalvi–Weinstein criteria, have been proposed for BSS in a singlestage and multi-stage context [16,20]. Multiple-input multiple-output (MIMO) systems have received a lot of attention during the last decade, with several multiple access systems emerging during that time. Through BSS techniques, several strategies have shown promising results in the context of multiuser detection. To cite a few references, [9,17,21] reveal the potential of equalization strategies to source separation. Papadias proposed in [19,20] a source separation approach that is based on the Shalvi–Weinstein criterion. The proposal is called multiuser kurtosis and consists of the kurtosis maximization, constrained to an orthogonal global response. It has been a great advance on the ﬁeld of BSS because it has proved global convergence for an arbitrary number of users, which had not been done before. We have previously proposed a source separation criterion based on the estimation of the probability density function (pdf) of the ideally recovered signals [5]. The criterion uses the knowledge of the pdf of transmitted signals, as in [25], to construct a parametric model that retains the statistical properties of the source signals. Thus, the separation matrix is optimized in order to maximize the similarities between the pdf of the recovered signal and the parametric model. For this sake, we use the Kullback–Leibler divergence to measure these similarities. The criterion is then similar to the one presented in [22]; however we consider a much simpler stochastic approximation to derive an adaptive algorithm.

Our objective in this work is to propose multiuser methods based on the pdf estimation based criterion and use different strategies to ensure that all sources are recovered given some requirements (e.g., linear mixture, discrete sources). Two strategies are used for this task: (i) an auxiliary criterion that penalizes the estimated sources (outputs) that are correlated and (ii) a ﬁltering that constrains the obtained global response to be orthogonal. Also, applications in the context of multi-user processing are presented in order to evaluate the performance of the proposals comparing them with existing blind criteria. The rest of the paper is organized as follows. The system model for memoryless and convolutive systems is described in details in Section 2. In Section 3, the pdf estimation based blind criterion is revisited in the context of SISO systems. In Section 4, the criteria for adaptive blind source separation are presented in detail, showing the different strategies for multi-user consideration. Simulation results that illustrate the performance of the proposals in some typical situations are presented in Section 5. Finally, our conclusions are stated in Section 6. 2. System model 2.1. Blind source separation: general assumptions In the model we made the following assumptions [26]: (AS1) K sources with independent and identically distributed (i.i.d.) and mutually independent zero mean discrete sequences ak ðnÞ; k ¼ 1; . . . ; K: (AS2) MIMO linear channel. (AS3) At least as many sensors as sources. (AS4) Noise is zero-mean, ergodic, stationary Gaussian sequence independent of ak ðnÞ: If we consider M sensors in the receiver we can represent, for instantaneous mixtures, the received signal at time instant n as (1)

xðnÞ ¼ HaðnÞ þ vðnÞ, K

where aðnÞ ¼ ½ a1 ðnÞ aK ðnÞ is the vector of sources, H is the M K channel matrix, vðnÞ is the

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1061

M 1 vector of additive noise and xðnÞ is the M 1 vector of received signals. The received signals are then processed by the MIMO equalizer given by the matrix

Through the use of a linear antenna array in the receiver, we can write the discrete time model for the received signal of the K users as [7]

WðnÞ ¼ ½ w1 ðnÞ

xðnÞ ¼

wK ðnÞ ,

(2)

K pﬃﬃﬃﬃﬃﬃ X Pk ak ðnÞ fðyk Þ þ vðnÞ,

(4)

k¼1

which produces a K 1 vector yðnÞ ¼ ½ y1 ðnÞ yK ðnÞ K that consists of the estimate of the sources. The receiver output can be mathematically written as yðnÞ ¼ WH ðnÞxðnÞ ¼ WH ðnÞHaðnÞ þ v0 ðnÞ ¼ GðnÞaðnÞ þ v0 ðnÞ, where the superscript transposition and

H

ð3Þ

stands for Hermitian

GðnÞ ¼ WH ðnÞH ¼ ½ g1 gK 2 3 g11 gK1 6 7 .. 7 .. 6 . ¼ 6 .. . 7 . 4 5 g1K

gKK

KK

is the global response matrix and v0 ðnÞ ¼ WH ðnÞvðnÞ is the ﬁltered noise at the receiver output. Fig. 1 depicts the above-described system. For the case of multi-user processing in practical applications such as wireless communications, some particular characteristics for the system are considered. This is the topic of next section. 2.2. Multi-user processing For multi-user processing we consider a space division multiple access (SDMA) scenario with several users transmitting simultaneously.

where Pk is the transmission power, ak is the transmitted sequence, vðnÞ is a vector of spatial additive white Gaussian noise (AWGN) samples. For simpliﬁcation reasons we assume that the signal comes from a preferential direction, thus yk is the direction of arrival (DOA) of the kth user and fðyk Þ ¼ ½ f 1 ðyk Þ f M ðyk Þ T is the array response vector, where the array phase response at the mth antenna is modeled as

2d pðm 1Þ sinðyk Þ f m ðyk Þ ¼ exp , (5) l where d is the inter-element spacing in the array and l is the carrier wavelength. We assume in this paper that d ¼ l2 : The channel (mixture matrix) for all users is then denoted by the matrix F ¼ ½ fðy1 Þ

fðyK Þ .

(6)

In order to detect/separate multiple users, adaptive spatial ﬁlters are provided to each desired user as shown in Fig. 2. This corresponds to the separation matrix in the general BSS scheme. Then, the output for the kth user, provided by the kth beamformer (spatial ﬁlter), denoted by wk ; is given at time instant n by yk ðnÞ ¼ wH k ðnÞxðnÞ.

Fig. 1. General blind source separation scheme.

(7)

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1062

Fig. 2. Antenna array architecture for multi-user detection in memoryless systems.

Moreover, the vector of beamformers outputs is denoted by yðnÞ ¼ ½ y1 ðnÞ yK ðnÞ T : When intersymbol interference (ISI) is considered, we have for the received signal given by convolutive mixtures the following model: xðnÞ ¼

K LX k 1 pﬃﬃﬃﬃﬃﬃ X Pk ak ðiÞ ak ðn iÞ

is the diagonal matrix with the complex gains (amplitude and phase rotations) of the kth user channel. In order to simplify notation, we can use some special vectors. Then, the mixture signal can be given by XðnÞ ¼

k¼1 i¼0

fðyk;i Þ þ vðnÞ,

K X

(12)

Hk Ak ðnÞ þ VðnÞ,

k¼1

ð8Þ where

where ak ðiÞ is the channel coefﬁcient for ith path of the kth user that has a DOA yk;i and Lk is the number of multipaths for user k: In a matricial form, we can write the model for the mixtures [20]: pﬃﬃﬃﬃﬃﬃ Hk ¼ Pk Fðyk Þ ak , (9)

where N is the number of coefﬁcients in each space–time ﬁlter of the receiver;

where

Ak ðnÞ ¼ ½ ak ðnÞ

Fðyk Þ ¼ ½ fðyk;0 Þ fðyk;1 Þ

fðyk;Lk 1 Þ ML

xM ðnÞ T ,

ak ðn Lk N þ 2Þ TðL

ð13Þ

k þN 1Þ1

(14)

is a composite matrix with the steering vectors for the kth user and, ak ðLk 1Þ Þ

xðnÞ ¼ ½ x1 ðnÞ

xT ðn N þ 1Þ TMN1 ,

k

(10)

ak ¼ diagð½ ak ð0Þ ak ð1Þ

XðnÞ ¼ ½ xT ðnÞ

(11)

VðnÞ ¼ ½ vT ðnÞ and

vðnÞ ¼ ½ v1 ðnÞ

vT ðn N þ 1Þ TMN1

vM ðnÞ T .

ð15Þ

The matrix Hk is the channel convolution matrix of the space–time channel for the kth user

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1063

Fig. 3. Space–time multi-user receiver structure.

given by [20] Hk

2

Hk ðnÞ

3

0MðN 1Þ

6 6 0M1 6 6 60 ¼ 6 M1 6 6 .. 6 . 4

0M1

Hk ðn 1Þ

0MðN 2Þ

Hk ðn 2Þ

0MðN 3Þ

..

0MðN 1Þ

Hk ðn N þ 1Þ

.

where 0kl is a k l matrix of zeros. The receiver structure, which consists of a set of space–time ﬁlters is represented in Fig. 3. Then, we can write the following expressions for the estimated sources and the receiver structure: yk ðnÞ ¼ WH (17) k ðnÞXðnÞ, where Wk ¼ ½ wTk0 wTk _n ¼ ½ wk;1;n_

wkðN 1Þ TMN1 , wk;M;n_ T

M1

ð18Þ

_

and n is the time index of the temporal ﬁlter associated to each antenna. 3. PDF estimation based blind criterion: SISO systems 3.1. Review

7 7 7 7 7 7 7 7 7 5

,

ð16Þ

MNðNþLk 1Þ

Let wideal be an ideal zero-forcing linear equalizer, its output can be written as yðnÞ ¼ wH ideal xðnÞ,

(19)

where xðnÞ ¼ HaðnÞ þ vðnÞ,

(20)

H is the N ðN þ L 1Þ convolution matrix of the single-user channel with impulse response of length L and N is the length of the impulse response of the equalizer [2]. Then, using Eq. (20) in (19), it is possible to write: yðnÞ ¼ ðHaðnÞ þ vðnÞÞH wideal ¼ aH ðnÞHH wideal þ vH ðnÞwideal ¼ aH ðnÞ HH wideal þvH ðnÞwideal |ﬄﬄﬄﬄﬄ{zﬄﬄﬄﬄﬄ} gideal

In [5] was proposed a blind criterion in the context of SISO systems (blind deconvolution). This method is revisited here.

H

¼ a ðnÞgideal þ WðnÞ ¼ aðn dÞ þ WðnÞ,

ð21Þ

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1064

where gideal is the ideal system response (a ‘‘Dirac’’-type vector), d is an arbitrary delay and WðnÞ is a random variable (r.v.) assumed Gaussian [2]. Eq. (21) states that the pdf of the signal at the output of the ideal equalizer is a mixture of Gaussians given by

(23) yields pðyÞ ¼ pðyÞ ln dy Fðy; s2r Þ

1 Z 1 pðyÞ ln½pðyÞdy ¼

1 Z 1

pðyÞ ln½Fðy; s2r Þdy. Z

DpðyÞjjFðy;s2r Þ

1

ð24Þ

1

S X 1 jy ai j2 pY ;ideal ðyÞ ¼ qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ exp

pðai Þ, 2s2W 2ps2W i¼1 (22) where the ai are the possible values (scalars) of aðn dÞ; assumed to belong to the transmitted alphabet A which has cardinality S. Since the pdf of the equalized signal is known, we desire to construct a criterion that forces the adaptive ﬁlter to produce signals with the same (or as similar as possible) pdf as the ideal one. This leads to the well-known measure of similarities between strictly positive functions (such as the pdfs), the Kullback–Leibler Divergence (KLD) [12]. In order to use the KLD to provide pdf estimation, a parametric model, which is function of the ﬁlter parameters, is constructed [2]. A natural choice is the same model of mixture of Gaussians like the one in Eq. (22) considering the estimate of the pdf of the sample yðnÞ: Then

jyðnÞ ai j2 exp

pðai Þ, 2s2r

J FPC ðwÞ ¼ Efln½Fðy; s2r Þg ( "

S X jyðnÞ ai j2 exp

¼ E ln A 2s2r i¼1 #) pðai Þ

.

ð25Þ

The Fitting pdf Criterion (FPC) criterion corresponds to minimizing J FPC ðwÞ: Furthermore, it is known that minimizing Eq. (25) corresponds to ﬁnding the entropy of y if Fðy; s2r Þ ¼ pY ðyÞ [1, p. 59]. A stochastic algorithm for ﬁlter adaptation is given by wðn þ 1Þ ¼ wðnÞ mrJ FPC ðwÞ Ps jyðnÞ ai j2 ðyðnÞ ai Þ i¼1 exp

2s2r rJ FPC ðwðnÞÞ ¼ x ; P S jyðnÞ ai j2 2 sr i¼1 exp 2s2

S X 1 Fðy; s2r Þ ¼ pﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 2ps2 |ﬄﬄﬄ{zﬄﬄﬄ}r i¼1 A

This procedure aims to force pðyÞ to be ‘‘as close’’ as possible to the model given by Eq. (23) adapting the ﬁlter parameters. So, using Eq. (24) we can write the integrals as expectations over pðyÞ since such density has to be approximated to the structure given in Eq. (23). Minimizing (24) is equivalent to minimizing only the Fðy; s2r Þ-dependent term, and this is used to construct our cost function, that is

r

ð23Þ

(26)

is the chosen parametric model, where s2r is the variance of each Gaussian in the model. In the pattern classiﬁcation ﬁeld this kind of parametric functions, which are used to measure similarities against other functions, are called target functions [2]. Then, applying KLD to compare the pdf of the recovered signal (equalizer output) and

where m is the step size. The adaptive algorithm which uses the proposed criterion will be called fitting pdf algorithm (FPA). Eq. (26) shows an important property of the algorithm: it takes into account the phase of the transmitted symbols. The computational complexity of this algorithm is proportional to the computation of S exponentials which are required by Eq. (26). Thus, its

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

complexity is a little higher than other blind LMSlike algorithms [10]. In the sequel we discuss some issues of the selection of the cost function and similarities with criteria previously reported in the literature.

1065

Quadratic distance (QD)—the cost function is deﬁned using the quadratic distance between the pdfs to measure the similarities Z 1 JðwÞ ¼ ½pY 2 ðuÞ pA2 ðuÞ2 du, (28)

1

3.2. Discussion One important issue on the criterion design is the cost function selection. Indeed, as the KLD is a semi-distance, the order of the functions in the KLD impacts the resulting functional. If the inverse order on KLD were taken, we would obtain Z 1 Fðy; s2r Þ DFðy;s2r ÞkpðyÞ ¼ Fðy; s2r Þ ln dy pðyÞ

1 Z 1 ¼ Fðy; s2r Þ ln½Fðy; s2r Þdy

1 Z 1

Fðy; s2r Þ ln½pðyÞdy

1

¼ H½Fðy; s2r Þ E F fln½pðyÞg,

ð27Þ

where H½ stands for the entropy. In this case, if the selected cost function were the one represented in Eq. (27), the resultant algorithm from the minimization of DFðy;s2r ÞkpðyÞ would be computationally more complex due to the existence of two terms dependant on the parametric model. Clearly, the main motivation of retaining DpðyÞjjFðy;s2r Þ and not DFðy;s2r ÞkpðyÞ concerns the complexity of implementation. The analysis about possible differences on performance issues will be investigated in later works. On the other hand, other information-theoretic approaches have been employed in the context of equalization. We brieﬂy describe the major common points of them in the sequel. Firstly, it should be noted that the FPC can be viewed as a sort of Parzen estimation [18,23], with the center of the Gaussian kernels chosen to be the elements of the transmitted alphabet A: Using Parzen estimation for the density of the recovered data, in [18] some variations of the consideration of pdf ﬁtting are discussed. In summary, they use the following three approaches:

where Y 2 ¼ fjyðnÞj2 g and A2 ¼ fjai j2 g: Sampled pdf fitting—aims to ﬁt the pdf only at a number N p of representative points JðwÞ ¼

Np 1 X ½p 2 ðui Þ T i 2 , N p i¼1 Y

(29)

where T i ¼ pA2 ðui Þ: Matched pdf—the similarities between the desired pdf and the estimate are measured according to Z 1 JðwÞ ¼ p2Y 2 ðuÞp2A2 ðuÞ du. (30)

1

The resultant algorithms for the three abovementioned criteria are somewhat similar to the FPA since the Parzen window is the Gaussian one. In [15], a clustering method is proposed also based on measure of similarities between density functions. The measure is a sort of Cauchy–Schwartz measure between pdfs given as " # R pðxÞqðxÞdx ﬃ , JðwÞ ¼ ln pRﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ (31) R p2 ðxÞdx q2 ðxÞdx where pðxÞ and qðxÞ are pdfs of the involved signals. In this case, a Gaussian (Parzen) kernel is employed to estimate the density of recovered data, providing an adaptation procedure similar to the one in [18]. The work with major similarities is reported in [22]. In this case, a statistical reference criterion is derived based on the use of KLD. It inserts a regeneration function to provide a recursive estimation of the data using the consideration that the noise is assumed Gaussian. Despite the very similar structure of that criterion and the FPC, the algorithms from both are quite different. Another difference is the existence of an additional degree of freedom (parameter s2r ) in the FPC. In [22], the authors claim that the choice of the variance of the Gaussians in the model does

ARTICLE IN PRESS 1066

C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

not affect the performance of their algorithm. Particularly, the choice of such a parameter has been shown to play a key role in the cost function since the variance of the Gaussians can be used to make the estimate more robust, as discussed in details in [3]. The FPC has been generalized to the multi-user processing case as described in the next section.

proposed in the context of a multi-user constant modulus algorithm (MU-CMA) in [21]. This strategy was called explicit decorrelation by [7], a nomenclature adopted in this paper. This approach results in the following multiuser criterion [4]: J MU-FPC ðwk Þ ¼ J FPC ðwk Þ þ g

K X K X i¼1

4. Multi-user pdf estimation based criteria In order to generalize the criterion to the multiple sources case, we have used the conditions for signal recovering, that are directly stated from the Shalvi–Weinstein (SW) criterion. These conditions can be written as [19]: (C1) ak ðnÞ is i.i.d. and zero mean ðk ¼ 1; . . . ; KÞ; (C2) ak ðnÞ and aq ðnÞ are statistically independent for kaq and have the same pdf, (C3) jk½yk ðnÞj ¼ jka j ðk ¼ 1; . . . ; KÞ; (C4) Efjyk ðnÞj2 g ¼ s2a ðk ¼ 1; . . . ; KÞ; (C5) Efyk ðnÞyq ðnÞg ¼ 0; kaq; where ak ðnÞ is the sequence transmitted by the kth source, ka is the kurtosis, s2a is the variance of the transmitted sequence, and k½ is the kurtosis operator. Conditions (C1)–(C4) are the direct interpretation of the SW theorem [24] for signal recovering and Condition (C5) guarantees achievement of all the sources involved. However, other criteria can be used to perform the signal recovering (or separation) than kurtosis maximization proposed in [24]. Since the pdf estimation approach aims, indirectly, for the equalization of all higher order statistics, FPC replaces the kurtosis maximization for interference removal. Furthermore, the strategy for making Condition (C5) hold deﬁnes multi-user criteria as described in the sequel. 4.1. Explicit decorrelation The ﬁrst criterion uses the strategy of a penalization term that measures the cross-correlation of the beamformers outputs. It was initially

jrij j2 ,

(32)

j¼1 jai

where g is a regularization parameter and rij is the correlation term between the outputs of the ith and jth beamformers. The criterion described in Eq. (32) is called multi-user fitting pdf criterion (MU-FPC). The adaptation procedure of the MU-FPC for the kth user in a memoryless system (spatial processing) is then wk ðn þ 1Þ ¼ wk ðnÞ mrJ FPC ðwk Þ

g

K X i¼1 iak

b rik ðnÞpi ðnÞ,

ð33Þ

where b rik ðnÞ is the ði; kÞ element of the matrix Ry ðnÞ and pi ðnÞ is the ith column of matrix PðnÞ; which are computed as Ry ðn þ 1Þ ¼ oRy ðnÞ þ ð1 oÞyðnÞyH ðnÞ Pðn þ 1Þ ¼ oPðnÞ þ ð1 oÞxðnÞyH ðnÞ,

ð34Þ

where o is a forgetting factor. Eqs. (33) and (34) deﬁne the multi-user fitting pdf algorithm (MUFPA). For space–time multi-user processing it is also necessary to guarantee decorrelation of the signals in a time interval in order to remove the ISI. Thus, the criterion becomes [4] J MU-FPC ðWk Þ ¼ J FPC ðWk Þ D

þg

K X K X 2 X i¼1

j¼1 jai

jrij ð‘Þj2 ,

ð35Þ

‘¼ D2

where rij ð‘Þ ¼ Efyi ðnÞyj ðn ‘Þg is the cross-correlation between the ith and jth outputs from the space–time receivers with a time interval ‘; and D2 is the maximum delay for which the signals from the several users must be decorrelated.

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

The MU-FPA, for the space–time processing, is then given by [3]: Wk ðn þ 1Þ ¼ Wk ðnÞ mrJ FPC ðWk Þ D

g

K X 2 X i¼1 iak

‘¼ D2

b pi;‘ ðnÞ, rik;‘ ðnÞb

ð36Þ

Ry;‘ ðn þ 1Þ ¼ oRy;‘ ðnÞ þ ð1 oÞyðnÞyH ðn ‘Þ, (37) P‘ ðn þ 1Þ ¼ oP‘ ðnÞ þ ð1 oÞXðnÞyH ðn ‘Þ, yðn ‘Þ ¼ ½ y1 ðn ‘Þ

yK ðn ‘Þ T ,

(38) (39)

D D ;...; , (40) 2 2 where b rik;‘ ðnÞ is the cross-correlation between the ith and jth users estimate with time delay ‘; given by the ði; jÞth element of matrix Ry;‘ ðnÞ; and b pi;‘ ðnÞ is the ith column of matrix P‘ ðnÞ: This criterion and its stochastic adaptive algorithm present good results but suffer from the strong trade-off between the number of different recovered sources (successful recovering) and steady state error w.r.t. the parameter g; due to the use of a penalization term in the multi-user criterion. An alternative to this strategy aiming to cope with the problem of losing users (unsuccessful recovering) and a good steady state error is presented in next subsection. ‘¼

4.2. Constrained criterion Using conditions (C1)–(C5) [19] proposed a multi-user criterion based on the maximization of the kurtosis of the recovered signals named multiuser kurtosis (MUK) maximization. The recovering of all sources is ensured by a constraint that forces global response to be an orthogonal matrix. The criterion divides the separation task into two parts: the equalization step, that maximizes the kurtosis, and the separation one, that performs the decorrelation of the outputs. For each task, we denote We for the equalization part, and W for the later one. The separation is performed by means of

1067

a Gram–Schmidt orthogonalization of matrix We [19]. However, this criterion requires a pre-whitening process on the received signals in order to assure that the variance will also be equalized. As we have mentioned before, the use of such ‘‘decoupled’’ processing makes possible the use of equalization techniques to obtain source separation, constraining the global system response to be orthogonal. The key point is that the considered equalization criterion has to respect the necessary and sufﬁcient conditions for blind signal recovering. We have noticed that the FPC can be viewed as a general case of the Shalvi–Weinstein one. This is due to the fact that equalizing the pdfs of signals corresponds to equalizing all higher-order moments (cumulants) of the signals, and matching all cumulants comprises the kurtosis one, as stated by Shalvi–Weinstein and MUK criteria. Indeed, we beneﬁt from the parametric form of pdf estimation provided by the FPC to cope with the high computational load required to match all cumulants between the input and output of the system. Since matching all cumulants implies matching the kurtosis, we can replace the pdf estimation procedure with the kurtosis maximization, in the MUK criterion, and the set of conditions is still valid as necessary and sufﬁcient conditions for blind source separation. With this approach, we replace the processing with a penalization term that is performed by the MU-FPA, see Eq. 32, by a constrained procedure aiming at improving the performance. Finally, the criterion is given by 8 K P > < min J FPC ðWÞ ¼ DpY ðyÞkFðyj ;s2r Þ ; W

> :

j¼1

(41)

H

subject to: G G ¼ cI;

where c is an arbitrary constant and I is the identity matrix. Since, there is no pre-whitening performed, the unit variance of the received signals cannot be assured. Furthermore, when the pdfs are equalized all the statistics have to be equal. Then, it is not necessary an orthonormalization and a scaled and permuted version of the sources will be achieved as for BSS strategies [14].

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1068 Table 1 MU-CFPA

5.1. Spatial multi-user processing

1. 2. 3. 4. 5. 6.

An SDMA system with four users is used as the evaluation scenario of the algorithms. In the receiver, we use a linear antenna array with eight elements. The DOAs for the users are the following: 1 ; 52 ; 29 and 76 : In each direction, 2000 QPSK unit power symbols are transmitted over 100 Monte Carlo trials. In this environment, we have assumed that power control is used to assure the same received power to all users, and the signal-to-noise ratio (SNR) for each sensor equals 30 dB. In order to evaluate the performance of the proposed algorithm we use the constant modulus error (CME) deﬁned for the kth user as follows:

Initialize Wð0Þ for n40 Obtain We ðn þ 1Þ from Eqs. (42) and (26) Obtain w1 ðn þ 1Þ ¼ we1 ðn þ 1Þ for j ¼ 2 : K Compute

wj ðn þ 1Þ ¼ wej ðn þ 1Þ

j 1 X e ðwH l ðn þ 1Þwj ðn þ 1ÞÞwl ðn þ 1Þ l¼1

7.Go to 5 8. Go to 2

Thus, the new adaptive algorithm is obtained by replacing the step of kurtosis maximization by the following computation: We ðn þ 1Þ ¼ WðnÞ mw rJ FPC ðWðnÞÞ,

(42)

where rJ FPC ðWðnÞÞ is given in Eq. (26) and performing just an orthogonalization in matrix We by means of the Gram–Schmidt procedure. The resulting algorithm is then called multiuser constrained FPA (MU-CFPA) and is summarized in Table 1. In terms of computational complexity, the MUCFPA is an LMS-like algorithm, although the orthogonalization procedure slightly increases the complexity of the FPA. However, the MU-CFPA can be applied only on memoryless systems. In the case of the space–time processing, the global response is a tensor and its orthogonalization is more complex than the matrix one.

5. Computational simulations This section demonstrates the performance of the proposed multi-user criteria, which is compared to blind separation criteria found in the literature. We have chosen three common situations in multi-user processing in order to evaluate the performance over different environments. They are described in the sequel.

CMEk ðnÞ ¼ ðjyk j2 RÞ2 .

(43)

The constant R is related to the power of the transmitted constellation. In our case we will assume a normalized power, i.e., R ¼ 1: Firstly, we have investigated the gain in terms of steady state error when we use two different strategies for obtaining all sources, namely the MU-FPA and MU-CFPA. For the environment previously described, we have simulated both algorithms and observed the temporal evolution of the CME. The simulation parameters for the algorithms are: m ¼ 10 3 ; g ¼ 10 3 ; s2r ¼ 0:1 chosen to achieve a CME as low as possible and initializations are performed as Wð0Þ ¼ We ð0Þ ¼ Ry ð0Þ ¼ Pð0Þ ¼ I: For the MU-FPA, the forgetting factor used in the correlation matrices estimation is set to o ¼ 0:96: Fig. 4 shows the comparison of the average CME for both algorithms. As one can see, the MU-CFPA outperforms the MU-FPA in terms of steady state error, where the difference of performance between the two algorithms is about 10 dB. This is due to the dropping of the decorrelation term in Eq. (32), which eliminates the trade-off between steady state error and number of lost users [7]. In the case of MUCFPA, this term is not taken into account, and the decorrelation procedure is done by means of the orthogonalization of the separation matrix, which improves the steady state performance. It is worth mentioning that g was chosen to ensure successful

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072 15 10 5

CME [dB]

0 −5 −10 −15

MU-FPA

−20 10 dB

−25 −30 MU-CFPA

−35

0

250

500

750

1000

Iterations Fig. 4. Performance evaluation of MU-FPA MU-CFPA:

1069

computed over 20 independent runs (Monte Carlo trials). Since we deﬁne convergence as the necessary number of iterations to reach the lower value of CME (or another metric), we observe that the MU-CFPA (around 300 iterations) outperforms the MU-CMA (around 2000 iterations) in terms of convergence rate and steady-state error due to the use of different decorrelation methods. The MUCMA step size was taken as the highest one that achieves convergence. This behavior of faster convergence is due to the considered number of higher order statistics in the criteria that allow a faster estimation of the signal pdf since the MUFPA uses much more HOS than the MU-CMA [3,6].

5.2. Instantaneous mixtures

15 10 5

CME [dB]

0 −5 MU-CMA

−10 −15 −20 MU-CFPA

−25 −30 −35 0

500

1000

1500

2000

2500

3000

3500

4000

Iterations Fig. 5. Temporal evolution of the average constant modulus error for multi-user processing K ¼ 4 and M ¼ 8:

recovering of all sources without replication of a speciﬁc user (or users). Thus, we use the MU-CFPA instead of the MUFPA for comparing the criterion with other known blind multi-user algorithms, namely the MUCMA [21]. For this sake, we use the same setup and the simulation parameters for MU-CFPA described previously. The simulation parameters are: mMU-CFPA ¼ 4 10 3 ; mMU-CMA ¼ 10 3 ; Wð0Þ ¼ Ry ð0Þ ¼ Pð0Þ ¼ I; o ¼ 0:96 and g ¼ 10 3 : Fig. 5 shows the temporal evolution of the average CME for MU-CFPA and MU-CMA

In this experiment, we have taken memoryless mixture matrices (instantaneous ones) for evaluating the behavior of BSS algorithms on this environment. We have considered K ¼ 4 users transmitting QPSK unit power signals and M ¼ 6 sensors in the receiver. The noise is inserted in each sensor with power given by SNR ¼ 15 dB: The mixture matrix is complex valued and drawn randomly from a Gaussian zero mean and unit variance distribution for each of 100 independent trials. Fig. 6 shows the temporal evolution of the average CME for the considered algorithms. We have also included the MUK algorithm, and principal component analysis (PCA) is performed for this algorithm since pre-whitening is required. Simulation parameters are: mMUK ¼ mMU-CMA ¼ 4 10 3 ; mMU-CFPA ¼ 10 2 ; o ¼ 0:98; g ¼ 10 3 ; s2r ¼ 0:2 and Wð0Þ ¼ We ð0Þ ¼ Ry ð0Þ ¼ Pð0Þ ¼ I: One can easily observe that the random behavior of the mixture makes the MU-CFPA achieve the ﬁnal CME at the same order of the other methods, however, it outperforms the other blind criteria w.r.t. the convergence rate. Again, this behavior is due to the use of a different number of higher order statistics by the criteria. Considering all HOS provides the observed gain [3]. The MU-CFPA was used instead of the

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

1070 0 −2 −4

MUK

CME [dB]

−6 −8

MU-CMA

−10

MU-CFPA

−12 −14 −16 −18

0

1000

2000

3000

4000

Iterations Fig. 6. Temporal evolution of the average constant modulus error for blind source separation of random channels K ¼ 4 and M ¼ 6:

Table 2 Space–time system setup User

Path 1

Path 2

Delay (T)

DOA (rad)

Delay (T)

DOA (rad)

#1

0.1

1.1

#2

0.4

2p 5 p 7

1.2

p 3 p

6

MU-FPA due to its better performance in this case of study.

A continuously trained LMS algorithm is included in this comparison as a reference for the blind algorithms. In this case, the transmitted sequences from all users are known at the receiver to update the ﬁlters making a data-aided processing. It should be noted that the MU-CFPA is not used because it does not have a known space–time version. For each algorithm, we have measured the normalized residual interference (RI), which is deﬁned as "P " " jgk;i ðnÞj maxi jgk;i ðnÞj" i , (44) RIk ðnÞ ¼ maxi jgk;i ðnÞj where gk;i is the ith component of the vector gk ðnÞ ¼ WH k H and H is a composite channel matrix given by H ¼ ½ H1 H2 HK : This ﬁgure of merit indicates the amount of residual interference that exists in each case when compared to a zero-forcing optimal solution. Fig. 7 shows the temporal evolution of the average RI for the considered algorithms (MU-CMA and MU-FPA) under the scenario described above. The curve is an average over both users and 200 independent trials of the experiment. Simulation parameters are: mMU-FPA ¼ 10 3 ; mMU-CMA ¼ 5 10 3 ; mLMS ¼ 10 2 ; o ¼ 0:99; s2r ¼ 0:1 and Wk ð0Þ ¼ Ry;‘ ð0Þ ¼ P‘ ð0Þ ¼ I: It can be seen that the MU-FPA converges much faster than the MU-CMA and slightly slower than the LMS in the initial acquisition phase. Similar performance 10

5.3. Space– time multi-user processing 5

MU-CMA

RI [dB]

In this case, we consider K ¼ 2 users transmitting simultaneously unit power QPSK signals. We also consider Lk ¼ 2 for the users and their relative delays (in terms of the symbol interval T), and DOAs are shown in Table 2. Additive noise is inserted in each antenna that composes the receiver with an SNR ¼ 30 dB: Furthermore, the pulse shape is a raised cosine with roll-off 0.35. At the receiver, each sensor has temporal ﬁlters with N ¼ 2 coefﬁcients. We also consider D ¼ 2 delayed samples for temporal decorrelation of the signals. Also, we assume that both users are recovered at a decision delay ‘ ¼ 0:

0

MU-FPA 5

−10

LMS −15 0

500

1000

1500

2000

2500

3000

Iterations Fig. 7. Averaged residual interference for space–time multiuser processing.

ARTICLE IN PRESS C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

superiority of MU-FPA was veriﬁed over a large range of multiuser scenarios where different number of sources, relative powers, relative delays and angle displacements were considered.

1071

Acknowledgements The authors would like to deeply thank the anonymous reviewers for their comments and suggestions that contributed a lot to the manuscript. They also thank Dr. Renato R. Lopes by his careful proofreading and comments.

6. Conclusions and perspectives In this paper we have presented a two blind criteria based on the estimation of the probability density function of the recovered signals for adaptive blind source separation. The criterion of interference removal is based on the maximization of the similarity between the recovered signal probability density function and a parametric model that ﬁts the system order and transmitted signal statistical characteristics. The Kullback–Leibler divergence is then used to derive the algorithm. Two different strategies that guarantee the recovering of the different sources involved in the process are used to construct multi-user approaches. In the ﬁrst, a penalization term is inserted to force the recovered sources to be uncorrelated with each other. The latter employs a procedure that constrains the global system matrix being orthogonal so that the different fonts can be recovered. By evaluating the proposals in typical environments of multi-user communications, we have observed a better convergence rate and steady state error from our approaches compared to other existing blind criteria for the same purpose. A direct extension of this work is to derive a space–time version of the constrained algorithm using tensors to model the system and signals involved. Furthermore, an analytical analysis of the inﬂuence of the number of considered higher order statistics on the criteria is under development. Another direction is the analytical and computational comparison of different measures than KLD one to similarities between functions and different directions to KLD, as mentioned in Section 3. Afterwards, the inclusion of nonparametric estimation using kernels is envisaged to extend the use of the proposed methods for problems when the density of the sources is not available, such as in biomedical processing.

References [1] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, UK, 1995. [2] C.C. Cavalcante, Neural prediction and probability density function estimation applied to blind equalization, Master’s Thesis, Federal University of Ceara´ (UFC), Fortaleza, CE - Brazil, February 2001. [3] C.C. Cavalcante, On blind source separation: proposals and analysis of multi-user processing strategies, Ph.D. Thesis, State University of Campinas (UNICAMP) – DECOM, Campinas, SP - Brazil, April 2004. [4] C.C. Cavalcante, F.R.P. Cavalcanti, J.C.M. Mota, Adaptive blind multiuser separation criterion based on loglikelihood maximisation, IEE Electron. Lett. 38 (20) (2002) 1231–1233. [5] C.C. Cavalcante, F.R.P. Cavalcanti, J.C.M. Mota, A PDF estimation-based blind criterion for adaptive equalization, in: Proceedings of the IEEE International Symposium on Telecommunications (ITS 2002), Natal, Brazil, 2002. [6] C.C. Cavalcante, J.C.M. Mota, J.M.T. Romano, Polynomial Expansion of the Probability Density Function About Gaussian Mixtures, in: Proceedings of the IEEE Workshop on Machine Learning for Signal Processing (MLSP 2004), Sa˜o Luı´ s, Brazil, 2004. [7] F.R.P. Cavalcanti, J.M.T. Romano, Blind multiuser detection in space division multiple access systems, Ann. des Te´le´commun. (7–8) (1999) 411–419. [8] P. Comon, Independent component analysis: a new concept?, Signal Process. 36 (3) (1994) 287–314. [9] N. Delfosse, P. Loubaton, Adaptive blind separation of independent sources: a deﬂation approach, Signal Process. 45 (1995) 59–83. [10] Z. Ding, Y.G. Li, Blind Equalization and Identiﬁcation, Marcel Dekker, New York, USA, 2001. [11] D. Donoho, On Minimum Entropy Deconvolution, Academic Press, New York, 1981, pp. 565–608. [12] S. Haykin (Ed.), Unsupervised adaptive ﬁltering, Source Separation, vol. I, Wiley, New York, 2000. [13] J. He´rault, C. Jutten, B. Ans, De´tection de grandeurs primitives dans un message composite par une architecture de calcul neuromime´tique en apprentissage non supervise´, in: Actes du Xe´me Colloque GRETSI, Nice, France, 1985, pp. 1017–1022. [14] A. Hyva¨rinen, E. Oja, J. Karhunen, Independent Component Analysis, Wiley, New York, 2001.

ARTICLE IN PRESS 1072

C.C. Cavalcante, J.M.T. Romano / Signal Processing 85 (2005) 1059–1072

[15] R. Jenssen, J.C. Prı´ ncipe, T. Eltoft, Information cut and information forces for clustering, in: Proceedings of the IEEE Workshop on Neural Networks for Signal Processing (NNSP2003), Toulose, France, 2003, pp. 459–467. [16] J.-L. Lacoume, P.-O. Amblard, P. Comon, Statistiques d’Ordre Supe´rieur pour le Traitement du Signal, (Traitement du Signal), Masson, Paris, 1997. [17] S. Lambotharan, J.A. Chambers, A.G. Constatinides, Adaptive blind retrieval techniques for multiuser DSCDMA signals, IEE Electron. Lett. 35 (9) (1999) 693–695. [18] M. La´zaro, I. Santamarı´ a, D. Erdogmus, K.E. Hild, II, C. Pantaleo´n, J. C. Prı´ ncipe, Stochastic blind equalization methods based on PDF ﬁtting using parzen estimator, IEEE Trans. Signal Process., to appear. [19] C.B. Papadias, Globally convergente blind source separation based on a multiuser kurtosis maximization criterion, IEEE Trans. Signal Process. 48 (12) (2000) 3508–3519. [20] C.B. Papadias, Blind Separation of Independent Sources Based on Multiuser Kurtosis Optimization Criteria, vol. 2, Wiley, New York, 2000, pp. 147–179 (Chapter 4). [21] C.B. Papadias, A.J. Paulraj, A constant modulus algorithm for multiuser signal separation in presence of delay

[22]

[23]

[24]

[25]

[26]

spread using antenna array, IEEE Signal Process. Lett. 4 (6) (1997) 178–181. J. Sala-Alvarez, G. Va´zquez-Grau, Statistical reference criteria for adaptive signal processing in digital communications, IEEE Trans. Signal Process. 45 (1) (1997) 14–31. I. Santamarı´ a, C. Pantaleo´n, L. Vielva, J.C. Prı´ ncipe, Adaptive Blind Equalization Through Quadratic PDF Matching, in: Proceedings of the XI European Signal Processing Conference (EUSIPCO 2002), vol. II, Toulouse, France, 2002, pp. 289–292. O. Shalvi, E. Weinstein, New criteria for blind deconvolution of nonminimum phase systems (Channels), IEEE Trans. on Information Theory 36 (2) (1990) 312–321. K. Torkkola, Blind signal separation in communications: making use of known signal distributions, in: Proceedings of the 1998 IEEE DSP Workshop, Bryce Canion, UT, USA, 1998. J.K. Tugnait, Adaptive blind separation of convolutive mixtures of independent linear signals, Signal Process. 73 (1999) 139–152.

Recommend Documents

Blind adaptive multiuser detection based on ... - Semantic Scholar

reduced complexity nlms algorithm for blind adaptive multiuser detection

NEW ALGORITHM FOR BLIND ADAPTIVE EQUALIZATION BASED ...

New Criteria for Blind Equalization Based on PDF Fitting