Modified Singular Value Decomposition by Means of Independent ...

Report 3 Downloads 55 Views
Modified Singular Value Decomposition by Means of Independent Component Analysis V. D. Vrabie

a,b,1

J. I. Mars a J.-L. Lacoume a

a Laboratoire

des Images et des Signaux, ENSIEG - INPG, BP 46, 38402 Saint Martin d’H`eres Cedex, France

b Laboratorul

de Analiza si Prelucrarea Imaginilor, UPB - Bucuresti, Romania

Abstract In multisensor signal processing (underwater acoustics, geophysics, etc.), the initial dataset is usually separated into complementary subspaces called signal and noise subspaces in order to enhance the signal to noise ratio. The Singular Value Decomposition (SVD) is a useful tool to achieve this separation. It provides two orthogonal matrices that convey information on normalized wavelets and propagation vectors. As signal and noise subspaces are on the whole well evaluated, usually the SVD procedure cannot correctly extract only the source waves with a high degree of sensor to sensor correlation. This is due to the constraint given by the orthogonality of the propagation vectors. To relax this condition, exploiting the concept of Independent Component Analysis (ICA), we propose another orthogonal matrix made up of statistically independent normalized wavelets. By using this combined SVD-ICA procedure, we obtain a better separation of these source waves in the signal subspace. Efficiency of this new separation procedure is shown on synthetic and real datasets. Key words: Multisensor Signal Processing, Singular Value Decomposition, Eigenimages, Subspace Separation, Independent Component Analysis.

1

Introduction

The Singular Value Decomposition (SVD) is a powerful decomposition in linear algebra. This procedure is also referred in signal processing as Karhunen Email addresses: [email protected] (V. D. Vrabie ), [email protected] (J. I. Mars), [email protected] (J.-L. Lacoume). 1 Corresponding author. Tel.: +33-4-76-82-64-57; fax: +33-4-76-82-63-84

Preprint submitted to Elsevier Science

19 November 2003

Lo`eve Transformation [8] or Principal Component Analysis [12]. The basic approach to compute the SVD is given in [6]. SVD is used in multisensor signal processing to separate the initial dataset into signal and noise subspaces [5,7,12–14]. It provides two orthogonal matrices that convey information on normalized wavelets and propagation vectors. In some applications [4], the SVD is also used to decompose the signal subspace in two subsets: • The first subset contains what the authors have called sources with a “high sensor to sensor correlation”. • The second subset is made up of sources with a “low sensor to sensor correlation”. For example, in Vertical Seismic Profiling (VSP) in geophysics, this tool is used to define the low-pass, the band-pass and the high-pass subspaces respectively, in terms of range of singular values. This allows to separate the downgoing wavefield, upgoing wavefield and the noise. We will show that the low-pass subspace usually cannot give a correct description of the source waves with a high degree of sensor to sensor correlation and, generally, will not extract these waves only. In order to overcome this problem, this paper proposes to associate the SVD procedure with the Independent Component Analysis (ICA). Hence, a new right matrix is created and made up of fourth order independent normalized wavelets. This makes it possible to relax the non physically justified orthogonality condition of the propagation vectors, improving the low-pass subspace estimate. The content of this paper is as follows: section 2 introduces the model of signals recorded on multisensor array. Section 3 is a brief presentation of the SVD and the subspace separation concept. Section 4 refers to the ICA notion and its use for the separation of the signal subspace in two parts. Applications of this new separation tool on synthetic and real datasets are shown in section 5 before the conclusions.

2

Model of multisensor array signals

2.1 General model In multisensor array processing, the propagation of waves is described by the convolutive model [10]: xi (t) =

P Z X

aip (t − τ )sp (τ )dτ + bi (t)

p=1

2

(1)

The received signal xi (t) on the ith sensor, with i = 1, .., Nc and Nc the number of sensors, results from the superposition of P source waves sp (t) via the transfer functions aip (t). bi (t) are unknown noises, supposed to be Gaussian, spatially white, centered and independent of the source waves. In discrete time, the model can be written as: xi (n) =

P X ∞ X

aip (n − l)sp (l) + bi (n)

(2)

p=1 l=−∞

2.2 Simplified Model In a great number of applications (underwater acoustics, seismic prospection, etc.) the propagation of waves causes only a delay and an attenuation. In these situations each transfer function aip (n) is reduced to an attenuation aip and a time delay nip , so that the model (2) becomes: xi (n) =

P X

aip sp (n − nip ) + bi (n)

(3)

p=1

2.3 Preprocessing The aim of the preprocessing is to obtain the smallest possible signal subspace. To achieve that, we apply a time correction (i.e. waves alignment) for each recorded signal, that compensate the sensor to sensor delay of the dominant component [5,7]. In VSP for example, the downgoing waves are the dominant waves and the sensor to sensor delay can be estimated by cross-correlation. Denoting s1 (n) the aligned wave, the model (3) becomes after alignment: yi (n) = ai1 s1 (n) +

P X

0

0

aip sp (n − nip ) + bi (n)

(4)

p=2 0

0

where yi (n) = xi (n + ni1 ), nip = nip − ni1 and the noise bi (n) = bi (n + ni1 ). From now on, we will consider the simplified model (4) assuming that the aligned source wave s1 (n) is independent from the others and therefore inde0 pendent from sp (n − nip ) [11]. On the assumption that Nt time samples are available, we rewrite the received signals in a data matrix Y ∈ RNc ×Nt : Y = {yin = yi (n) | 1 ≤ i ≤ Nc , 1 ≤ n ≤ Nt }

3

(5)

3

Singular Value Decomposition

The SVD of the data matrix Y is [6,9]: Y = U∆VT =

N X

λk uk vkT

(6)

k=1 Nc ×Nc

where U ∈ R is an orthogonal matrix made up of left singular vectors Nc uk ∈ R (eigenvectors of YYT ), V ∈ RNt ×Nt is an orthogonal matrix made up of right singular vectors vk ∈ RNc (eigenvectors of YT Y), and ∆ ∈ RNc ×Nt is a pseudo-diagonal matrix ∆ = diag(λ1 , . . . ,λk , . . . , λN ) made up of singular values λk , with elements ordered λ1 ≥ . . . ≥ λN ≥ 0 and N = min(Nc , Nt ). The product uk vkT is a rank one Nc × Nt matrix called the k th eigenimage of data matrix Y [4]. Due to the orthogonality of the singular vectors, the eigenimages form an orthogonal basis for the representation of Y. Let uk = [uk1 , . . . , uki , . . . , ukNc ]T and vk = [vk1 , . . . , vkj , . . . , vkNt ]T be the left and right k th singular vectors. In the k th eigenimage, the sample data yijk at time j on sensor i is expressed as yijk = uki vkj . Thus: • vk is called “normalized wavelet” since vkj (1 ≤ j ≤ Nt ) gives the time dependence of the component associated to the k th eigenimage. • uk is called “propagation vector” since uki (1 ≤ i ≤ Nc ) gives the amplitude of the normalized wavelet received on the ith sensor. In the noise-free case, if the Nc recorded signals are linearly independent, the matrix Y is full rank and its perfect reconstruction requires all eigenimages. If the recorded signals are linearly dependent (i.e. they are equal to within a scale factor) the matrix Y is of rank one and its perfect reconstruction requires only the first eigenimage.

3.1 Subspace Separation Let M be the rank of Y in the noise-free case and N in the real case (i.e. when the recorded signals are corrupted by noise). Based on the assumption that the noise is spatially white and decorrelated from the source waves, the SVD is used directly on the raw data Y to achieve a separation between signal and noise subspaces [4,5,7,12–14]: Y = Ysig + Ynois =

M X

λk uk vkT +

k=1

N X

λk uk vkT

(7)

k=M +1

The first M eigenimages of data matrix represents the signal subspace Ysig and the other N − M the noise subspace Ynois . In practice, the choice of M 4

depends on the relative magnitudes of singular values. In reference [4], the authors decompose the signal subspace in two parts. In this partition, the data matrix Y is projected on three orthogonal subspaces: Y = YLP + YBP + YHP

(8)

• The low-pass subspace YLP , associated to the higher Q singular values, contains source waves with a “high degree of sensor to sensor correlation”. • The band-pass subspace YBP , associated to residual M − Q singular values in the signal subspace, is constructed by events with a “little degree of sensor to sensor correlation”. • The high-pass subspace YHP is the noise subspace Ynois . For example, in geophysical interpretation of Vertical Seismic Profiling (VSP), after synchronizing the downgoing waves, YLP contains the downgoing wavefield, YBP the upgoing wavefield and the high-pass subspace YHP the noise components. In crosswell 2 applications, YLP describes high amplitude correlated events (direct waves, refracted waves or tube waves), YBP low amplitude correlated events, as reflections and converted modes and YHP represents the uncorrelated noise. The purpose of this kind of decomposition is to isolate only the highest sensor to sensor correlated waves in the low-pass subspace YLP , like the aligned wave s1 (n) defined in Eq. (4). This is possible if s1 (n) is decorrelated from the other source waves and if there is a large gap between the first Q and the other M − Q singular values describing the signal subspace. It was shown in [5] that Q = 1 for a perfect wave alignment and Q = 2 for a non-perfect wave alignment (for example a phase rotation of s1 (n) along the array due to dispersive phenomena). In this case, SVD provides a good separation of aligned waves in the low-pass subspace. If there is no large gap between the first Q singular values and the following ones, YLP will not extract entirely the aligned waves, and will sometimes contain high energy non-aligned waves as well [14]. Hence, more than 2 singular values are usually needed for a complete extraction of the aligned waves. Let R be the number of singular values necessary to describe the subspace containing the aligned waves (denoted with YR ). Note that Q ≤ R ≤ M and the subspace YR contains also non-aligned waves. Let UR = [u1 , . . . , uR ] ∈ RNc ×R and VR = [v1 , . . . , vR ] ∈ RNt ×R be the matrices made up by the first R left and the right singular vectors, respectively. Let ∆R be the diagonal matrix having on its diagonal the first R singular values. The normalized wavelets vk in VR describing the subspace YR are orthogonal and therefore second order statistically independent. The propagation vectors uk in UR are also orthogonal by construction. There is no physical reason for which the propagation vectors should be orthogonal. In this case, imposing the criterion of orthogonality for the vectors uk , the normalized wavelets are 2

measurement between two wells

5

forced to be a mixture of source waves [14]. One way of relaxing this constraint is to find another matrix of normalized f ∈ RNt ×R for which these waves are “as indepene 1, . . . , v eR] = V wavelets [v R dent as possible”. Based on the assumption that s1 (n) is independent from the other source waves, this can be achieved by the Independent Component Analysis (ICA).

4

SVD and Independent Component Analysis

ICA is a blind decomposition of a multichannel dataset made of an unknown linear mixture of unknown source signals, based on the assumption that the source signals are mutually statistically independent [1–3]. This means that the cross-cumulants of any order must be equal to zero. As usual, the third order cumulants are discarded because they are generally null or close to zero, and we will use fourth order statistics which have been found to be sufficient for instantaneous mixtures [1,2]. ICA can be solved by a two-stage algorithm, consisting of a prewhitening and a high-order step. The first step is carried out by the SVD directly on the raw data. The normalized wavelets in the matrix VR ∈ RNt ×R , second order statistically independent, are an instantaneous linear mixture of source waves if the non-aligned source waves in YR are contained in a subspace of dimension smallest than R − 1 [14]. Assuming that and supposing the source waves mutually statistically independent, the second step consists in finding a rotation f = V B are (orthogonal) matrix B ∈ RR×R for which the components of V R R independent at the fourth order. There are different methods of finding the rotation matrix B: Joint Approximate Diagonalization of Eigenmatrices (JADE) [1], Maximal Diagonality (MD) [2] or Simultaneous Third-Order Tensor Diagonalization (STOTD) [3]. In this paper we used the JADE algorithm, although we believe that similar results are obtained employing other algorithms. T fT , the subspace Y becomes: Since VR = BV R R YR =

R X

T eT λk uk vkT = UR ∆R VR = CR V R

(9)

k=1

where CR = [c1 , . . . , cR ] ∈ RNc ×R is CR = UR ∆R B. From this matrix, we obtain two matrices 3 : f = [u e 1, . . . , u e R ] ∈ RNc ×R made up of normalized • a rectangular matrix U R e k = ck / kck k, which are generally columns (the new propagation vectors) u not orthogonal. 3

kck k is the `2 -norm of the vector ck .

6

f = diag (β , . . . , β ) ∈ RR×R with diagonal elements • a diagonal matrix D R 1 R βk = kck k called “modified singular values”.

The elements βk are usually not ordered. For this reason, we perform a perf as well as between the columns of V f mutation P between the columns of U R h iR f . Using the notations U f e P(1) , . . . , u e P(R) , to order the inputs of D R P(R) = u h

i

³

´

f f e P(1) , . . . , v e P(R) and D V P(R) = v P(R) = diag βP(1) , . . . , βP(R) , with the diagonal elements ordered, i.e. βP(1) ≥ · · · ≥ βP(R) ≥ 0, the decomposition (9) becomes:

e P(R) D e P(R) V eT YR = U P(R) =

R X

T e P(k) v eP(k) βP(k) u

(10)

k=1

With Q1 the number of normalized wavelets necessary to describe the high sensor to sensor correlated waves (depending on the relative magnitudes of the f f elements in the diagonal matrix D P(R) ), the low-pass YLP and the band-pass f Y BP subspaces using SVD-ICA are given by: e LP = Y

Q1 X

T e P(k) v eP(k) βP(k) u

(11)

k=1

e BP = Y

R X

T e P(k) v eP(k) βP(k) u +

M X

λk uk vkT

(12)

k=R+1

k=Q1 +1

There is no difference between the subspaces YR obtained with the classical SVD and with SVD-ICA, but there are differences between the low-pass and e P(k) , these subspaces are band-pass subspaces. Due to the orthogonality of v orthogonal as well. Furthermore, in the decomposition (10) a stronger criterion e P(k) has been imposed, i.e. to be statistifor the new normalized wavelets v cally independent at the fourth order, and, at the same time, the condition of e P(k) has been relaxed. orthogonality for the new propagation vectors u

5

Applications

5.1 Synthetic data In this section we present an application on a synthetic dataset. The preprocessed recorded signals Y on 8 sensors array (Nc = 8) are represented in Fig. 1. In this case we have P = 3 source waves: the waves wA and wC with infinite apparent propagation velocity on the array (arriving at the same time on each sensor, representing the aligned waves) and the wave wB , with a negative apparent propagation velocity on the array, which is present between sensor 1 and 4. On each wave we have applied an amplitude reduction and a phase 7

y1 (t)

y8 (t) Fig. 1. Synthetic dataset Y.

rotation along the array in order to simulate the absorption phenomena. The waves with a high degree of sensor to sensor correlation are the waves wA and wC . Spatially white Gaussian noise was added to these source waves. The signal-to-noise ratio 4 is SN R = 30dB and the number of time samples is Nt = 256. The percent relative magnitudes 100 · λk /

N P

λk of the singular values given by

k=1

SVD are presented in Fig. 2. The number of singular values used to describe

Fig. 2. Relative magnitudes in ∆.

the signal subspace is M = 6. Among those 6 singular values, the first two are related to the highest correlated waves (Q = 2). Fig. 3 shows the first 6 normalized wavelets vk . As we can see, these wavelets are an instantaneous linear mixture of the source waves.

Fig. 3. Normalized wavelets in VR .

The low-pass subspace YLP obtained with the first Q = 2 eigenimages is presented in Fig. 4 and the band-pass subspace YBP obtained with the following 4 eigenimages in Fig. 5. It is clear from these figures that the classical SVD implies some artefacts in the low-pass and band-pass subspaces for a wavefield kY

4

k

sig The SN R definition used here is SN R = 20 log10 kYnois k , where k.k is the matrix Frobenius norm, Ysig the noise-free dataset and Ynois the additive noise.

8

Fig. 4. Low-pass subspace YLP .

Fig. 5. Band-pass subspace YBP .

separation objective. Fig. 6 shows the relative magnitudes of the elements in the diagonal maf trix D P(R) obtained using the SVD-ICA on the subspace YR (in this case, R = M = 6) and also the last singular values given by SVD, related to the noise subspace, i.e. λ7 and λ8 . The number of components that describes the high correlated waves is Q1 = 2. With SVD-ICA the repartition of the modified singular values in 3 clusters is more evident than with classical SVD. f Furthermore, the normalized wavelets in V P(R) presented in Fig. 7 are clearly

e P(R) . Fig. 6. Relative magnitudes in D

closer to the original waves than the normalized wavelets given by SVD (Fig. 3). f f The low-pass subspace Y LP is given in Fig. 8 and the band-pass subspace YBP in Fig. 9. The low-pass subspace extracts the first two highly correlated waves without any visible interference with the other waves. The residual artefacts presented in classical SVD (Fig. 4) are eliminated. This improvement is due to the fact that by ICA we have imposed a fourth order independence condition stronger than the decorrelation used in classical SVD. With this method of decomposition we have relaxed the non physically justified orthogonality of 9

e P(R) . Fig. 7. Normalized wavelets in V

the propagation vectors.

e LP . Fig. 8. Low-pass subspace Y

e BP . Fig. 9. Band-pass subspace Y

For a quantitative evaluation of the errors, we use the mean-square error between the ideal low-pass subspace YLP and the reconstructed ones: M SE =

° 1 ° °YLP − YLP °2 Nc Nt

(13)

where k.k is the matrix Frobenius norm and YLP represents either the estimated low-pass subspace YLP obtained using the SVD or the estimated f low-pass subspace Y LP obtained using the SVD-ICA. The gain factor (in dB) between the two methods is given by: µ GM SE = 10 log10

M SE|SVD M SE|SVD-ICA

¶ (14)

In this example we have M SE|SVD = 15.7 ∗ 10−3 for classical SVD and M SE|SVD-ICA = 5.2 ∗ 10−3 for SVD-ICA, giving a gain factor GM SE = 4.8 dB. 10

The gain factor for different signal-to-noise ratio SN R (in dB) of the dataset is presented in table (15). SN R

50

40

35

30

20

10

GM SE

8.5 7.9 6.9 4.8

3

1.3

(15)

5.2 Real data We present now an application on a real dataset obtained during an Ocean Bottom Seismic (OBS) acquisition. The recorded signals Y on Nc = 40 sensors array and Nt = 480 time samples, after the preprocessing, are presented in Fig. 10. In this case, the signal subspace Ysig is approximatively described

Fig. 10. Real dataset Y.

by the first M = 35 singular values. Fig. 11 shows the first 9 normalized wavelets vk in V. As we can see, the aligned waves are projected on the first R = 7 normalized wavelets. These wavelets, used to describe the subspace YR , are an instantaneous linear mixture of the source waves. With SVD-ICA,

Fig. 11. The first 9 normalized wavelets in V. f e P(k) in V the normalized wavelets v P(R) presented in Fig. 12 are clearly better separated than in the classical SVD.

11

Fig. 13 presents the relative magnitudes of the elements in the diagonal ma-

e P(R) . Fig. 12. Normalized wavelets in V f trices ∆R and D P(R) . A better distribution of these elements is obtained using SVD-ICA than using classical SVD.

e P(R) . Fig. 13. Relative magnitudes in ∆R and D

Fig. 14 shows the residual subspace (i.e. the difference between the initial data Y and the low-pass subspace YLP ) obtained using SVD and keeping the first Q = 2 eigenimages. In this case, the aligned waves are not comf pletely eliminated. Fig. 15 presents the residual subspace Y − Y LP obtained

Fig. 14. Residual subspace Y − YLP .

using SVD-ICA and Q1 = 2. Here, the low-pass subspace extracts the highly correlated waves without any visible interference with the other waves. 12

e LP . Fig. 15. Residual subspace Y − Y

6

Conclusions

We have presented a new method for multidimensional signal space separation into a low-pass and a band-pass subspace, respectively. The classical SVD imposes the orthogonality for the propagation vectors and forces the normalized wavelets in the right matrix to be a mixture of source waves. This procedure introduces artefacts in the two estimated subspaces. The results were improved and the number of errors in the two subspaces was decreased by using the ICA on the first R normalized wavelets in the right orthogonal matrix. The choice of the number of singular values Q and M for the classical SVD, as well as Q1 and R for SVD-ICA depends on the data themselves. The presented examples have shown that the clustering of modified singular values is more effective with SVD-ICA.

References [1] J.-F. Cardoso, A. Souloumiac, “Blind Beamforming for Non-Gaussian Signals”, IEE Proc.-F, Vol. 140, No. 6, 1993, pp. 362-370. [2] P. Comon, “Independent Component Analysis, A new Concept?”, Signal Processing, Special Issue on HOS, Vol. 36, No. 3, April 1994, pp. 287-314. [3] L. De Lathauwer, “Signal Processing Based on Multilinear Algebra”, PhD Thesis, Katholieke Universiteit Leuven, September 1997. [4] S.L.M. Freire, T.J. Ulrych, “Application of Singular Value Decomposition to Vertical Seismic Profiling”, Geophysics, Vol. 53, June 1988, pp. 778-785. [5] F. Glangeaud, J.L. Mari, F. Coppens, “Signal Processing for Geologists and Geophysicists”, Technip Ed., 1999.

13

[6] G.H. Golub, C.F. Van Loan, “Matrix Computations”, The John Hopkins University Press, Baltimore and London, 1989. [7] B. Hardage, “Seismic Application Series”, Vol. I, Crosswell Seismology & Reverse VSP, Geophysical Press Ltd, 1992. [8] C. H´emon, D. Mac´e, “Essai d’une application de la transformation KarhunenLo`eve au traitement sismique”, Geophysical Prospecting, Vol. 26, No. 4, 1978, pp. 600-626. [9] V.C. Klema, A.J. Laub, “The Singular Value Decomposition : its Computation and Some Applications”, IEEE. Trans on Auto. Control, AC. 25, No. 2, 1980, pp. 164-176. [10] J.-L. Lacoume, P.O. Amblard, P. Comon, “Statistiques d’ordre sup´erieur pour le traitement du signal”, Masson, 1997. [11] H.-L. Nguyen Thi, C. Jutten, “Blind Source Separation for Convolutive Mixtures”, Signal Processing, vol 45, pp. 209-229, 1995. [12] L.L. Scharf, “Statistical Signal Processing: Detection, Estimation, and Time Series Analysis”, Addison-Wesley, New York, 1991. [13] B. Ursin, Y. Zheng, “Identification of Seismic Reflections Using Singular Value Decomposition”, Geophysical Prospecting, No. 33, 1985, pp. 773-799. [14] V. D. Vrabie, “Statistiques d’ordre sup´erieur : applications en g´eophysique et ´electrotechnique”, PhD Thesis, I.N.P. of Grenoble, France, October 2003.

14