IEEE SIGNAL PROCESSING LETTERS, VOL. 20, NO. 7, JULY 2013
Blind Separation of Dependent Sources With a Bounded Component Analysis Deflationary Algorithm
Pablo Aguilera, Student Member, IEEE, Sergio Cruces, Senior Member, IEEE, Iván Durán-Díaz, Member, IEEE, Auxiliadora Sarmiento, Member, IEEE, and Danilo P. Mandic, Fellow, IEEE
Abstract—The problem of blind source separation of complex-valued sources from a linear mixture is addressed. We propose a deflationary algorithm for the sequential recovery of a set of communication signals, where each source is extracted by performing a Bounded Component Analysis of the linear mixture. The contribution of each recovered source to the observations is removed by minimizing its convex perimeter, without using second-order statistics. This requires running a gradient descent algorithm several times. To accelerate the convergence, we have derived a fast step size that exploits the second-order information of the cost function by means of the augmented Hessian matrix. Computer simulations show that the proposed method is able to blindly separate even dependent sources, as long as they satisfy the BCA separability conditions. The speed of convergence of this novel step size is also compared with other classical approaches.
Index Terms—Augmented Hessian matrix, blind signal separation, bounded component analysis, independent component analysis, step size.
I. INTRODUCTION
The problem of Blind Source Separation (BSS) aims at recovering a set of unknown signals that have been mixed, knowing only their general statistical or structural properties. The separation is performed by executing a Blind Source Extraction (BSE) algorithm several times. To prevent the algorithm from converging to the same source in each extraction, at least two paradigms have been proposed in the literature [1]. On the one hand, symmetric decorrelation methods execute several source extraction algorithms in parallel, whose solutions are usually stationary points of the cost function [2]. However, the absence of spurious (i.e., non-separating) local extrema is not easy to guarantee even under perfect conditions, due to the multimodal shape of the cost functions. On the other hand, sequential extraction methods aim to recover one source at a time, with an intermediate stage between recoveries which consists of the deflation of the contribution of
Manuscript received January 31, 2013; accepted April 21, 2013. Date of publication April 24, 2013; date of current version May 28, 2013. This work was supported in part by Projects TEC2011-23559 and TIC-7869. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Shahram Shahbazpanahi. The authors are with the Departamento de Teoría de la Señal y Comunicaciones, University of Seville, 41092-Seville, Spain and also with the Electrical and Electronic Engineering Department, Imperial College, London SW7 2AZ, U.K. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LSP.2013.2259814
the extracted source to the set of observable signals [3]. The main benefit of this method is that the absence of spurious extrema in the cost function can be theoretically proved in some situations [4], thus allowing the design of iterative algorithms with global convergence [5], [6].

In [7], a BSE algorithm based on the Bounded Component Analysis (BCA) of the observations was presented. It belongs to the family of methods that exploit geometrical properties of the sources to recover them from the unknown mixture (see, e.g., [8], [6] and references therein). For linear mixtures of bounded sources, it can be proved that the assumption of mutual independence is not necessary. Instead, we use a weaker hypothesis based on the domain separability of the sources.

In this work, we propose the separation of the sources by means of their sequential recovery, using the BCA-based extraction algorithm proposed in [7], which minimizes the normalized convex perimeter of the output. After each successful extraction we minimize, again by means of the convex perimeter, the contribution of the given source to the observations. Since the proposed method requires several runs of a gradient descent algorithm, we also propose a step size based on the complex augmented Hessian of the cost function in order to achieve fast convergence. The use of augmented information in the problem of BSS was first introduced in [9].

This paper is organized as follows. In Section II, the model and the main hypotheses are outlined. Section III presents the extraction and deflation algorithms for the sequential recovery of the sources. In Section IV we present an estimate of the gradient of the cost function, which is used in Section V to derive the proposed step size. Computer simulations illustrate the performance of the proposed method in Section VI. Finally, Section VII summarizes the conclusions of this paper.

II. MODEL AND HYPOTHESES OF BCA
Let us consider a linear instantaneous mixing model in the absence of noise. A set of $N$ complex-valued sources, $\mathbf{s}(t) = [s_1(t), \ldots, s_N(t)]^T$, is linearly mixed yielding the vector of observations denoted by $\mathbf{x}(t)$,
$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) \tag{1}$$
where $\mathbf{A} \in \mathbb{C}^{M \times N}$ is the mixing matrix (with $M \geq N$), and its columns are $\mathbf{a}_i$. The sources can be estimated by means of the Bounded Component Analysis of the observations under the following assumptions: the sources are bounded, $\mathbf{A}$ is full column-rank, and the convex support of the sources can be
1070-9908/$31.00 © 2013 IEEE
written as the Cartesian product of the convex supports of the individual sources. This latter hypothesis plays the role of the mutual independence condition that is necessary in the ICA framework. As stated in [10], for mutually independent and bounded sources, the last hypothesis is automatically satisfied (although the converse is not true).

III. A DEFLATIONARY METHOD FOR BLIND SOURCE SEPARATION
The blind source separation from the linear mixture (1) consists of the determination of an output vector whose entries are estimates of each one of the original sources, up to ambiguities such as the multiplication by a complex scalar and a possible permutation of the signals. Let us introduce the whitening matrix, which decorrelates the observations, and the extraction vectors, which recover the sources from the mixture according to (2). The whitened observations have an identity covariance matrix. To prevent the outputs from converging to the same source, we adopt a deflationary scheme, where the sources are extracted sequentially. This is done in two alternating stages:
1) The determination of the whitening matrix and of the extraction vector, by using (2).
2) The removal of the contribution of the extracted source to the observations after its estimation, as given by (3). This contribution depends on an estimate of the corresponding column of the unknown mixing matrix (this issue is addressed in Section III.B).
The removal of the estimated source from the observations implies a dimensional reduction of the problem. By repeating this sequential process of whitening, extraction and deflation once per source, the whole set of bounded sources is recovered up to the above-mentioned ambiguities.

A. Blind Extraction of a Bounded Source
The extraction of one source is achieved by solving the optimization problem (4), where the cost function (5) is the normalized convex perimeter of the output. In [7], it was shown that this scale and phase invariant cost function is free of spurious minima, so each local minimum always corresponds to the extraction of one source. The source code of the BCA extraction algorithm can be found in [11].

The minimization is achieved by means of a gradient descent algorithm, whose iterations are defined by the update equation (6), written in augmented notation [12]. At each iteration, the augmented extraction vector is updated along the negative augmented gradient of the cost function with respect to the extraction vector, scaled by the step size.

B. Identification of a Column of the Mixing Matrix
After the successful extraction of a bounded source, its contribution has to be removed from the observations. Let us define the remainder vector (7), which is obtained when the current estimated component is subtracted from the observations. The corresponding mixing vector is obtained by means of the component-wise minimization (8) of the convex perimeters of the elements of the remainder vector.

IV. AN ESTIMATE OF THE GRADIENT OF THE COST FUNCTION
In order to implement a gradient descent algorithm, the gradient of the cost function (5) has to be derived. In practice, the estimate of the convex perimeter of the output is computed from the available samples. Let us consider the set of samples that are located at the vertices of the convex hull of the support set of the output. These samples, identified by their temporal indexes, are arranged in the ordered set (9), with the first vertex repeated at the end so as to close the set of samples. The estimate of the convex perimeter (10) is then given by the sum of the lengths of the edges between consecutive vertices. The gradient of the cost function (11) can be estimated by applying the rules of complex differentiation.
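As a rough illustration of the sample-based estimate of the convex perimeter, the following Python sketch computes the length of the closed polygon through the vertices of the convex hull of a set of complex output samples, and then normalizes it. The function names, the monotone-chain hull routine, and the normalization by the root-mean-square amplitude of the output are assumptions made for this example only; the actual normalization is the one defined in (5), and the authors' own implementation is the one referenced in [11].

```python
import cmath
import math


def convex_hull(samples):
    """Return the convex hull vertices (counter-clockwise) of complex samples.

    Andrew's monotone chain is used here as an illustrative choice.
    """
    pts = sorted({(s.real, s.imag) for s in samples})
    if len(pts) <= 2:
        return [complex(x, y) for x, y in pts]

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(points):
        chain = []
        for p in points:
            # Drop points that would make a right turn or a straight line.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # the last point starts the other half of the hull

    hull = half_hull(pts) + half_hull(reversed(pts))
    return [complex(x, y) for x, y in hull]


def convex_perimeter(samples):
    """Sum of the edge lengths between consecutive hull vertices."""
    hull = convex_hull(samples)
    n = len(hull)
    if n < 2:
        return 0.0
    # The index (i + 1) % n wraps around, which closes the polygon.
    return sum(abs(hull[(i + 1) % n] - hull[i]) for i in range(n))


def normalized_convex_perimeter(samples):
    """Convex perimeter divided by the RMS amplitude of the samples.

    ASSUMPTION: this normalization is a stand-in chosen so that the value
    is invariant to scale and phase; it is not taken from equation (5).
    """
    power = sum(abs(s) ** 2 for s in samples) / len(samples)
    return convex_perimeter(samples) / math.sqrt(power)


# Hypothetical test data: a QPSK constellation plus one interior sample.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
samples = qpsk + [0.2 + 0.1j]
# The same samples after an arbitrary complex scaling (gain 3, phase 0.7 rad).
scaled = [3.0 * cmath.exp(0.7j) * s for s in samples]

print(convex_perimeter(samples))
print(normalized_convex_perimeter(samples), normalized_convex_perimeter(scaled))
```

In this example the hull is the square of side 2 formed by the four QPSK points, so the interior sample does not affect the perimeter. The two normalized values agree up to rounding, which illustrates the scale and phase invariance mentioned for the cost function.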
V. A STEP SIZE THAT EXPLOITS THE SECOND ORDER INFORMATION
Since the extraction of the whole set of sources requires running the gradient descent algorithm several times, we have derived a step size that provides fast convergence. This step size exploits the information about the local shape of the cost function. By means of a second order approximation we analyze the behavior of the cost function in the vicinity of a local minimum, i.e., when the value of the cost function at that minimum has been achieved. From [13, Appendix A.2.3.1], the value of the cost function at the next iteration can be approximated by the second order Taylor expansion (12), where (13) is the augmented Hessian, with its blocks defined in (14) and (15).

Since we are interested in attaining the local minimum, we replace the value of the cost function at the next iteration by its value at the local minimum, and the update (6) is substituted into (12) to obtain the quadratic equation (16) in the step size. For a given iteration, the solution to this quadratic equation is given by (17), which depends on a measure of the current excess of the cost function over the local minimum. This practical adaptive step size can be used directly in the update rule (6) to speed up the convergence (reducing the required number of iterations). When this excess is not known, we advise using a coarse initialization and a progressive annealing of its value over the iterations.

VI. SIMULATIONS
In this section, we present computer simulations in order to illustrate the performance of the proposed method when dealing with dependent sources. We considered a communication scenario with the transmission of $N$ bounded signals with unit power across a flat fading Rayleigh channel with Additive White Gaussian Noise (AWGN) and $M$ receiver antennas. The elements of the mixing matrix were randomly generated from a complex Gaussian distribution of zero mean and unit variance. We conducted three experiments to illustrate the main contributions of this work.

A. Separation of Sources With Different Constellations
In the first scenario, three sources were mixed and the Signal to Noise Ratio (SNR) was set to 40 dB. The first two sources had a 16-QAM constellation and a correlation factor of 0.6, while the third source was statistically independent and had a 32-QAM constellation. We ran 100 iterations of the proposed method for extracting all the original sources. In Fig. 1, the result of the BSS sequential algorithm is shown. The three sources were successfully recovered, up to permutations and phase ambiguities. The sum of the hidden bounded components makes up the set of observations.

Fig. 1. First experiment. The sequential recovery of the three sources (upper row) from the noisy observations (left column) is shown. The plotted vectors are the contributions of the estimated sources shown above them to the observations.

B. Blind Source Separation of Dependent Signals
In the second experiment, we mixed three QPSK sources to obtain three noisy observations (with the SNR ranging from 0 to 50 dB). We compared the performance of the proposed method with other BSS techniques such as ThinICA [14], JADE, and FastICA [1] with a pow3 non-linearity. Dependent sources (with a correlation coefficient of 0.5) were considered. Given the global transfer matrix, the Amari Index (AI) [15] is used to measure the quality of the separation (lower is better). The median values over 100 Monte Carlo runs of this comparison are shown in Fig. 2. As expected, the recovery failed when ICA separation
algorithms were used, since the sources were dependent. The proposed BCA algorithm was the only one able to recover the original signals, as long as the system satisfies the conditions stated in Section II.

Fig. 2. Second experiment. Results for deflationary BCA (the proposed method), ThinICA, JADE, and FastICA (with pow3 non-linearity) with a noisy mixture of dependent sources. The upper figure shows the Amari Index (AI) versus the SNR. The lower figure shows the absolute value of the coefficients of the global transfer matrix, plotted for each method in one particular situation. The permutation and the scale were corrected so that a successful extraction occurs when the matrix is close to the identity.

C. The Effect of the Step Size on the Number of Iterations
The third experiment illustrates the increase in the speed of convergence when using the proposed step size in the extraction of one source. We compared this step size with the more conservative Newton-Raphson (NR) step size (18), which results from taking only the first order approximation in (16). A linear and instantaneous mixture of QPSK bounded sources was considered. In the absence of noise, the proposed method was able to converge to the local minimum, thus extracting one of the bounded sources. The number of sources varied, taking the values 6, 20, and 50. The length of the sequences was set according to the number of sources, with 10000 and 20000 samples for 20 and 50 sources, respectively. In Fig. 3, the excess of the cost function for the NR and the proposed step sizes is shown. As the proposed step size uses more information about the local shape of the cost function, it usually allows a faster convergence (fewer iterations are needed to reach the minimum), which is more noticeable with a large number of sources.

Fig. 3. Third experiment. We show the excess of the cost function over the local minimum for the NR (18) and the proposed (17) step sizes, and for three different numbers of sources: 6, 20, and 50.

VII. CONCLUSION
We have presented a BSS method for the recovery of bounded sources from a linear mixture in a noisy scenario, even if they are dependent to some degree. It is based on the sequential extraction of all the original sources by following a deflationary scheme. Both the estimation of each bounded source and the deflation of
its contribution are done by means of a BCA-based algorithm, without using second-order statistics. In order to speed up the convergence of this extraction algorithm, we proposed a step size which exploits the local shape of the cost function, showing a fast descent in the sense of a reduced number of iterations. Computer simulations corroborate the ability of the proposed algorithm to blindly recover all the bounded sources even if they are dependent.

REFERENCES
[1] P. Comon and C. Jutten, Handbook of Blind Source Separation, 1st ed. New York, NY, USA: Academic, 2010.
[2] A. T. Erdogan, “Convergence analysis for a class of source separation methods,” in XXIII Int. Symp. Inform., Communication and Automation Technologies (ICAT), Oct. 2011.
[3] N. Delfosse and P. Loubaton, “Adaptive blind separation of independent sources: A deflation approach,” Signal Process., vol. 45, pp. 59–83, 1995.
[4] J. K. Tugnait, “Identification and deconvolution of multichannel linear non-Gaussian processes using higher order statistics and inverse filter criteria,” IEEE Trans. Signal Process., vol. 45, pp. 658–672, 1997.
[5] C. B. Papadias, “Globally convergent blind source separation based on a multiuser kurtosis maximization criterion,” IEEE Trans. Signal Process., vol. 48, pp. 3508–3519, 2000.
[6] A. T. Erdogan, “Globally convergent deflationary instantaneous blind source separation algorithm for digital communications signals,” IEEE Trans. Signal Process., vol. 55, no. 5, pp. 2182–2192, May 2007.
[7] S. Cruces, “Bounded component analysis of linear mixtures: A criterion of minimum convex perimeter,” IEEE Trans. Signal Process., vol. 58, no. 4, pp. 2141–2154, Apr. 2010.
[8] D. T. Pham, “Blind separation of instantaneous mixture of sources based on order statistics,” IEEE Trans. Signal Process., vol. 48, no. 2, pp. 363–375, 2000.
[9] S. Javidi, D. P. Mandic, and A. Cichocki, “Complex blind source extraction from noisy mixtures using second order statistics,” IEEE Trans. Circuits Syst. I, vol. 57, no. 7, pp. 1404–1416, 2010.
[10] S. Cruces, I. Durán-Díaz, A. Sarmiento, and P. Aguilera-Bonet, “Bounded component analysis of linear mixtures,” in IEEE Int. Conf. on Acoust. Speech and Signal Processing (ICASSP), Dallas, TX, USA, Mar. 2010.
[11] S. Cruces, BCA Extraction Algorithm Implemented in MatLab 2009 [Online]. Available: http://personal.us.es/sergio/alg/BCA.html
[12] D. P. Mandic and S. L. Goh, Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models. Hoboken, NJ, USA: Wiley, 2009.
[13] P. J. Schreier and L. L. Scharf, Statistical Signal Processing of Complex-Valued Data. Cambridge, U.K.: Cambridge Univ. Press, 2010.
[14] S. Cruces, A. Cichocki, and S.-I. Amari, “From blind signal extraction to blind instantaneous signal separation: Criteria, algorithms and stability,” IEEE Trans. Neural Netw., vol. 15, no. 4, 2004.
[15] S.-I. Amari, A. Cichocki, and H. H. Yang, “A new learning algorithm for blind source separation,” Adv. Neural Inform. Process. Syst., vol. 8, 1996.