A Simulated Annealing Version of the EM Algorithm for Non-Gaussian Deconvolution

M. Lavielle, E. Moulines

January 29, 1997
Abstract. The Expectation-Maximization (EM) algorithm is a very popular technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. Under very general conditions, the authors have shown that the attractive stationary points of the SAEM algorithm correspond to the global and local maxima of the observed likelihood. In order to avoid convergence towards a local maximum, a simulated annealing version of SAEM is proposed. An illustrative application to the convolution model, for estimating the coefficients of the filter, is given.
M. Lavielle: UFR de Mathématiques et Informatique, Université Paris V and CNRS URA 743, Université Paris-Sud, [email protected]
E. Moulines: Télécom Paris/URA 820, 46, rue Barrault, 75634 Paris CEDEX 13
1 Introduction

An important preliminary step in all linear deconvolution problems is the estimation of the filter coefficients, also referred to as the wavelet in seismic applications. Most of the deconvolution techniques known to date assume that the wavelet is minimum phase and estimate the transfer function of the filter from the second-order statistics of the observed data. Unfortunately, the minimum phase assumption is not always fulfilled and the system phase should also be recovered (it is well known that the phase information cannot be recovered from second-order moments). The estimation of non-minimum phase systems has received considerable attention in the past decade (see, for example, Giannakis and Mendel, 1989; see also Nikias and Mendel, 1993, and the many references therein). For non-minimum phase channels, most of the methods proposed are based on higher-order statistics of the received data (the third-order or fourth-order cumulants, or the corresponding frequency-domain quantities, e.g. the bispectrum or the trispectrum). These methods are most often easy to implement, but they are generally computationally intensive and far from optimal from a statistical point of view when a priori information is available on the distribution of the input signal. This is for example the case in seismic deconvolution or in digital communication applications, for which the distribution of the input signal is either known or can at least be modelled accurately. In these situations, important improvements in the performance of the estimates can be achieved by incorporating this information into the estimation procedure. In this contribution, we derive a maximum likelihood solution to this problem. In many situations that involve some non-observed data, the EM algorithm is a useful tool for maximizing the likelihood of the observations (Dempster et al., 1977).
Nevertheless, this algorithm requires at each iteration the computation of a conditional expectation, and this step cannot always be done in closed form. In such situations, a stochastic version of EM, the so-called SAEM algorithm proposed by Lavielle and Moulines (1995), can be used. SAEM requires simulating the non-observed data under the posterior distribution at each iteration. In the case of a convolution model, this simulation is achieved by using an MCMC algorithm, such as the Gibbs sampler or a Hastings-Metropolis procedure. It is well known that when EM converges, it converges to a stationary point of the likelihood, that is, any point where the gradient of the observed likelihood is zero (see Wu, 1983, Arslan et al., 1993). On the other hand, under some general hypotheses, the SAEM algorithm converges towards a local maximum of the observed likelihood. To ensure convergence only towards a global maximum, a simulated annealing version of this procedure should be used. We propose here a version of this algorithm for the identification of a convolution model. Some maxima of the likelihood are made up of several filters having similar transfer functions, up to the phase. When the input signals are not Gaussian, the observed likelihood function does not remain constant under a change of phase. It should then be possible to recover the correct phase by maximizing the likelihood. Some simulations and an application to seismic deconvolution are presented to illustrate the performance of the proposed method.
2 Incomplete data problem and the EM algorithm

2.1 Deconvolution as an incomplete data problem

Blind deconvolution can naturally be cast into the general framework of incomplete data problems. Denote by y = (y(1), ..., y(n))^t the vector of observed data samples. The data sample y is the realization of a random vector Y = (Y(1), ..., Y(n))^t which can be modelled as the linear convolution of an unobserved input sequence Z = (Z(1), ..., Z(n))^t in additive noise ε, i.e.,
Y(t) = Σ_{l=0}^{L} a(l) Z(t − l) + σ ε(t)    (1)
where a = (a(0), ..., a(L))^t is the (unknown) vector of filter (wavelet) coefficients and σ² is the noise variance. For simplicity, we assume Z(t) = 0 for t ≤ 0. Let A be the Fourier transform of a:
A(λ) = a(0) + a(1)e^{−iλ} + a(2)e^{−2iλ} + ... + a(L)e^{−Liλ}    (2)
     = a(L)(e^{−iλ} − r(1))(e^{−iλ} − r(2)) ... (e^{−iλ} − r(L))    (3)
     = |A(λ)| e^{iφ(λ)},  λ ∈ ℝ.    (4)

Here, A is called the transfer function of the filter a and φ its phase. The filter a is completely defined by its impulse response a(0), ..., a(L), that is, by the roots r(1), r(2), ..., r(L) and the scale factor a(L). Now, let a and ã be two filters such that |A(λ)| = |Ã(λ)| but that differ in their phases (φ(λ) ≠ φ̃(λ)). Then, it is well known that there exist some indexes j_1, ..., j_L such that

(r̃(j_1), r̃(j_2), ..., r̃(j_m)) = (r(j_1), r(j_2), ..., r(j_m))
(r̃(j_{m+1}), r̃(j_{m+2}), ..., r̃(j_L)) = (1/r̄(j_{m+1}), 1/r̄(j_{m+2}), ..., 1/r̄(j_L)).

Thus, the phase of a filter is directly related to the position of the roots of its transfer function. In particular, a filter is said to be minimum phase if the roots r(1), ..., r(L) are all outside the unit circle (i.e. |r(j)| > 1, 1 ≤ j ≤ L).
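The root/phase characterization above is easy to check numerically. The following sketch (hypothetical helper functions, not part of the paper's algorithm) factors the transfer polynomial with NumPy: writing x = e^{−iλ}, the roots r(j) are the roots of a(0) + a(1)x + ... + a(L)x^L.

```python
import numpy as np

def filter_roots(a):
    """Roots r(1), ..., r(L) of the transfer function: with x = e^{-i lambda},
    A = a(0) + a(1)x + ... + a(L)x^L = a(L)(x - r(1))...(x - r(L))."""
    return np.roots(a[::-1])  # np.roots expects the highest-degree coefficient first

def is_minimum_phase(a):
    """Minimum phase in the paper's convention: every root outside the unit circle."""
    return bool(np.all(np.abs(filter_roots(a)) > 1.0))
```

For instance, a = (1, −0.5) has the single root x = 2 and is minimum phase, while the time-reversed filter (−0.5, 1) has root x = 0.5 and is not.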
We shall make the following hypotheses concerning the random sequences Z and ε: (i) {Z(t)}_{t≥1} is a sequence of independent and identically distributed random variables distributed according to some known non-Gaussian distribution function (the extension to the case where the distribution of Z(t) is unknown but belongs to some parametric family is straightforward); (ii) {ε(t)} is a sequence of independent standardized Gaussian variables; (iii) Z(t) and ε(t′) are independent for all pairs of time indexes 1 ≤ t, t′ ≤ n. These assumptions, together with equation (1), completely specify the log-likelihood of the observed data samples. Let Z be the n × (L + 1) matrix such that Z(i, j) = z(i − j). Then, the complete log-likelihood is, up to a constant,

log f(y, z; θ) = −(n/2) log σ² − (1/(2σ²)) ‖y − Z a‖² + log π(z),    (5)

where π denotes the density of the input sequence. The likelihood of the observed data can then be obtained by computing the marginal distribution of Y, that is, by integrating over the values of the unobserved input data sequence:
g(y; θ) = ∫_{−∞}^{+∞} ... ∫_{−∞}^{+∞} f(y, z; θ) μ(dz(1)) ... μ(dz(n)),    (6)

where μ denotes the distribution of Z(t).
The likelihood of the observed data, because of the integration, is very intricate: it cannot even be expressed in closed form except in a very few simple cases (e.g., discrete distributions or, of course, the Gaussian distribution). By contrast, the structure of f(y, z; θ) is much simpler. This is a well-known situation where the likelihood of the complete-data vector X = (Y, Z), constructed by extending the observations with some unobserved (or latent) values, is related to the unknown parameters much more naturally than the measurements are. These kinds of problems can be tackled by resorting to the EM algorithm, which exploits the duality between the complete data and the incomplete data problems (see Dempster et al., 1977, Tanner, 1993).
2.2 The EM and MCEM algorithms

The EM algorithm is an iterative method for finding maximum likelihood estimates, which has several appealing properties relative to other iterative algorithms such as Newton-Raphson. First, it is often easily implemented because it relies on complete-data computations: the E-step of each iteration only involves taking expectations over complete-data conditional distributions, while the M-step requires complete-data maximum likelihood estimation, which often has a simple closed form. Secondly, it is numerically stable: each iteration increases the observed likelihood and the convergence is almost always to a local maximum. The EM algorithm maximizes the observed likelihood g(y; θ) by iteratively maximizing conditional expectations of log f(y, Z; θ). Each iteration of the EM algorithm is decomposed into two steps: an E-step and an M-step. At step k + 1, the E-step consists in evaluating the conditional expectation of the complete-data log-likelihood given Y = y and the current fit θ_k of the parameter vector:

Q(θ | θ_k) = E(log f(y, Z; θ) | Y = y; θ_k).    (7)
The M-step consists in finding θ_{k+1} that maximizes the function θ → Q(θ | θ_k) over the feasible set Θ. This procedure is repeated until convergence becomes apparent. For the convolution model, it is easily seen from (5) that S̃(y, z) = (s_1, s_2, s_3), defined by

s_1 = y^t y,  s_2 = Z^t y,  s_3 = Z^t Z,    (8)
are the sufficient statistics for the complete likelihood. In fact, the complete log-likelihood is, up to a constant,

log f(y, z; θ) = −(n/2) log σ² − (1/(2σ²)) (s_1 − 2 a^t s_2 + a^t s_3 a) + log π(z)    (9)

(this implies that any estimator based on the complete data model should be a function of S̃(y, z)). The expectation step comes down to computing the conditional expectation of the sufficient statistics given the current value of the parameter and the observed data:
s_{2,k} = E(Z^t y | Y = y; θ_k),  s_{3,k} = E(Z^t Z | Y = y; θ_k).    (10)
The maximization step writes

a_{k+1} = (s_{3,k})^{−1} s_{2,k}
σ²_{k+1} = (1/n) (s_1 − (s_{2,k})^t (s_{3,k})^{−1} s_{2,k})    (11)
θ_{k+1} = (a_{k+1}, σ²_{k+1})
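The M-step update (11) is a simple linear-algebra computation on the sufficient statistics. A minimal sketch (the function name `m_step` is ours, not the paper's):

```python
import numpy as np

def m_step(s1, s2, s3, n):
    """M-step of EM for the convolution model, Eq. (11):
    s1 = y'y (scalar), s2 = Z'y (vector), s3 = Z'Z (matrix)."""
    a = np.linalg.solve(s3, s2)      # a_{k+1} = s3^{-1} s2
    sigma2 = (s1 - s2 @ a) / n       # sigma^2_{k+1} = (s1 - s2' s3^{-1} s2) / n
    return a, sigma2
```

When the statistics are computed from a noiseless y = Z a, the update recovers a exactly and returns a (numerically) zero noise variance.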
Some serious difficulties arise when trying to apply the EM algorithm to the filter estimation problem. First, for most input signal densities the expectation step cannot be done in closed form because the conditional density of the input sequence given the data is very complicated (in particular, the conditional density k(z | y; θ) does not factorize as the product of the individual posterior densities k(z(t) | y; θ)). A notable exception is when the input signal Z(t) takes only a finite number of values, as in digital communication applications. In this case, the filter estimation comes down to a hidden Markov model estimation problem, and the conditional expectation of the complete-data log-likelihood can be computed by resorting to the Baum-Welch forward-backward procedure (see, for example, Kaleh and Vallet, 1994 for details).
Second, it appears that the incomplete likelihood generally has several local maxima; it is not difficult to guess that some of those maxima correspond to different determinations of the filter phase. In the deconvolution application, the retrieval of the correct system phase is of utmost importance, since errors in the estimation of the phase will severely impair the estimation of the input signal sequence itself. A stochastic version of the EM algorithm is the Monte-Carlo EM algorithm, initially proposed by Wei and Tanner (1990). In the MCEM algorithm, the expectation in the E-step is carried out using a Monte-Carlo method. Basically, the E-step at step k is replaced by the following procedure:
Simulation step (S-step): generate m(k) realisations z_{k,j} (j = 1, ..., m(k)) of the missing data vector under k(z | y; θ_k), the a posteriori density of the missing data given the observations and the current fit θ_k of the parameter vector.

Monte-Carlo integration: compute the current approximation of the conditional expectation of the complete-data log-likelihood according to

Q_k(θ) = (1/m(k)) Σ_{j=1}^{m(k)} log f(y, z_{k,j}; θ).    (12)
The maximization step remains unchanged. In the convolution model, the E-step comes down to computing a Monte-Carlo approximation of the conditional expectation of the complete-data sufficient statistics:

s_{2,k} = (1/m(k)) Σ_{j=1}^{m(k)} (Z_{k,j})^t y
s_{3,k} = (1/m(k)) Σ_{j=1}^{m(k)} (Z_{k,j})^t Z_{k,j},    (13)

where Z_{k,j} is the n × (L + 1) matrix containing the simulated input data samples z_{k,j}. The MCEM algorithm shares most of the attractive features of EM, with the notable exception of the monotone convergence property, i.e. the incomplete likelihood is not necessarily increased at each iteration. This prevents any guarantee about the convergence of the algorithm, except in very simple situations. Furthermore, the number of simulations m(k) is difficult to choose in practice. Good results are usually obtained with small values of m(k) during the first iterations, and larger values during the last iterations. Unfortunately, there is no theoretical result for choosing an optimal sequence {m(k)}.
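The Monte-Carlo averages (13) can be sketched as follows, assuming the m(k) simulated matrices Z_{k,j} have already been built from posterior draws of the input sequence (the function name is ours):

```python
import numpy as np

def mcem_statistics(y, Z_draws):
    """Monte-Carlo approximation (Eq. 13) of the sufficient statistics:
    Z_draws is a list of m(k) simulated n x (L+1) matrices Z_{k,j}."""
    m = len(Z_draws)
    s2 = sum(Z.T @ y for Z in Z_draws) / m
    s3 = sum(Z.T @ Z for Z in Z_draws) / m
    return s2, s3
```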
3 The SAEM algorithm

3.1 The basic algorithm

The SAEM (Stochastic Approximation EM) algorithm proposed by Lavielle and Moulines (1995) makes use of a stochastic approximation procedure for estimating the conditional expectation of the complete-data log-likelihood. As in the MCEM algorithm, the basic idea consists in splitting the E-step into a simulation step and an integration step. As in MCEM, the S-step consists in generating realisations of the complete data vector under the a posteriori density; the Monte-Carlo integration is replaced by a stochastic averaging procedure. The proposed algorithm may thus be summarized as follows:
Simulation: generate a realisation z_k of the missing data vector under k(z | y; θ_k), the a posteriori density of the complete data given the observations and the current fit θ_k of the parameter vector.

Stochastic approximation: compute the current approximation of the conditional expectation Q_k according to

Q_k(θ) = Q_{k−1}(θ) + γ_k (log f(y, z_k; θ) − Q_{k−1}(θ)),    (14)

where (γ_k) is a sequence of positive stepsizes.
Maximization: maximize Q_k(θ) over the feasible set Θ, i.e. find θ_{k+1} ∈ Θ such that Q_k(θ_{k+1}) ≥ Q_k(θ) for all θ ∈ Θ.

The stepsize allows the tuning of the stochastic excitation fed into the algorithm, avoiding convergence towards spurious stationary points (e.g. saddle points or local minima). One interest of SAEM is that all the results obtained for stochastic approximation in a general framework can be used. In particular, almost-sure pointwise convergence of the sequence of parameter estimates to local maxima of the incomplete likelihood is proved for a wide class of exponential models, with an appropriate sequence of stepsizes (see Lavielle and Moulines, 1995 for the many technical details). Typically, we set γ_k = 1/k to ensure convergence of the sequence {θ_k}. Furthermore, all the techniques developed to speed up the convergence of stochastic algorithms can be used, such as Polyak's (1990) averaging procedure and Kesten's (1958) procedure for computing an optimal stepsize sequence. They are very efficient for automatically reducing the number of iterations of SAEM.
3.2 SAEM coupled with MCMC

In the convolution model, the missing data (the input signals) are not independent under the a posteriori distribution k(z | y; θ_k). This joint distribution is complicated and cannot be simulated directly. An iterative simulation procedure such as an MCMC algorithm is necessary. In the convolution model, the SAEM algorithm coupled with the Gibbs sampler boils down to:
Choose an initial guess (θ_0, z_0). At step k:

- For t = 1, 2, ..., n, simulate z_k(t) from the conditional distribution k(z(t) | y; z_k(1), ..., z_k(t − 1), z_{k−1}(t + 1), ..., z_{k−1}(n); θ_k).
- Update the matrix Z_k.
- Update the statistics s_2 and s_3:

s_{2,k} = s_{2,k−1} + γ_k ((Z_k)^t y − s_{2,k−1})
s_{3,k} = s_{3,k−1} + γ_k ((Z_k)^t Z_k − s_{3,k−1})    (15)

- Compute θ_{k+1} according to (11).

Stop when {θ_k} converges. Instead of a deterministic visiting schedule, we can use a random schedule by choosing t with a uniform distribution on {1, 2, ..., n} (see Amit and Grenander, 1991 for a comparison of these schedules). In the numerical examples presented below, we visit all the components of z in a random order at iteration k by choosing a random permutation of (1, 2, ..., n) at each iteration. When the conditional distribution k(z(t) | y; z(t′), t′ ≠ t; θ_k) cannot easily be computed and simulated, the Hastings-Metropolis procedure can be used. At step k, the simulation of z_k(t) is performed as follows:
Choose z̃(t) from the prior distribution and set z̃(t′) = z_{k−1}(t′) for t′ ≠ t. Compute

Δ(z_{k−1}, z̃) = (1/(2σ_k²)) (‖y − a_k ⋆ z̃‖² − ‖y − a_k ⋆ z_{k−1}‖²).    (16)

Choose u with a uniform distribution on [0, 1] and set

z_k = z̃ if Δ(z_{k−1}, z̃) < 0 or u ≤ e^{−Δ(z_{k−1}, z̃)},
z_k = z_{k−1} if Δ(z_{k−1}, z̃) ≥ 0 and u > e^{−Δ(z_{k−1}, z̃)}.    (17)
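The single-coordinate Hastings-Metropolis update of Eqs. (16)-(17) can be sketched as follows. Here `A` is a hypothetical explicit n × n convolution matrix built from the current filter a_k, so that a_k ⋆ z is written A @ z; the function name and signature are ours.

```python
import numpy as np

def metropolis_update(y, A, z_prev, t, z_cand, sigma2, rng):
    """One Hastings-Metropolis update of coordinate t (Eqs. 16-17):
    the candidate equals z_{k-1} except at position t, where z_cand was
    drawn from the prior; accept with probability min(1, exp(-Delta))."""
    z_new = z_prev.copy()
    z_new[t] = z_cand
    delta = (np.sum((y - A @ z_new) ** 2)
             - np.sum((y - A @ z_prev) ** 2)) / (2.0 * sigma2)
    # accept if the residual decreases, or with probability exp(-delta) otherwise
    if delta < 0 or rng.uniform() <= np.exp(-delta):
        return z_new
    return z_prev
```

Note that a candidate which strictly decreases the residual (Δ < 0) is always accepted, so the exponential is never evaluated at a large negative Δ.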
In the original version of SAEM, the convergence of the sequence {θ_k}_{k≥0} towards a maximum of the likelihood is ensured if {z_k}_{k≥0} is a sequence of independent realizations of the posterior distributions {k(z | y; θ_k)}_{k≥0}. In this application of SAEM, the sequence of simulated series {z_k} is a Markov chain created with an MCMC procedure. Nevertheless, we can show that the asymptotically stable points of the SAEM algorithm are still the local maxima of the observed likelihood (Lavielle and Moulines, 1995). Furthermore, most of the general results concerning the convergence of stochastic approximations with independent perturbations extend to Markovian perturbations under some hypotheses (see Benveniste et al., 1990 and Duflo, 1996). In particular, the convergence of the sequence {θ_k} can be obtained directly when the random variables {Z(t)} take a finite number of values (precise convergence results in a general context will be considered in detail in a forthcoming paper). With the SAEM procedure, convergence towards a particular local maximum depends on the initial guess. A simulated annealing version of the SAEM algorithm should therefore be used to favour convergence towards a global maximum.
3.3 The simulated annealing version of SAEM

Let a_1 and a_2 be two filters with the same transfer function modulus. When the input signals Z are Gaussian, the observed likelihood function remains constant under a change of phase: we have g(y; a_1, σ²) = g(y; a_2, σ²). There exist several global maxima that correspond to filters having the same transfer function modulus but different phases. In particular, if a_1 is a maximum likelihood estimate of a, then a_2 is also a maximum likelihood estimate of a. That means that there is no hope of recovering the correct phase by maximizing the observed likelihood. When the input signals are not Gaussian, the observed likelihood function does not remain constant under a change of phase. If a_1 is a global maximum of the likelihood g, a local maximum of g belongs to a neighborhood of a_2. Here, some maxima of the likelihood are made up of filters having different phases, and SAEM can converge towards a local maximum, that is, a filter with a wrong phase. A simulated annealing procedure should theoretically ensure convergence towards a global maximum of g. We can then hope to recover the correct phase if the number of data is large enough. Equation (15) can be put in the general form of a stochastic algorithm

S_k = S_{k−1} + γ_k h(S_{k−1}) + γ_k e_k

with

h(S_{k−1}) = E(S̃(y, Z_k) | y; θ_k) − S_{k−1}
e_k = S̃(y, Z_k) − E(S̃(y, Z_k) | y; θ_k).    (18)
The usual simulated annealing version of this stochastic algorithm consists in adding a Gaussian white noise:

S_k = S_{k−1} + γ_k h(S_{k−1}) + γ_k e_k + c_k ξ_k,    (19)

where {c_k} is a sequence of positive numbers, referred to as the temperature, which decreases slowly to zero, and {ξ_k} is the added Gaussian white noise. This kind of algorithm can be used for maximizing a function f on ℝ^d by setting h = ∇f (this procedure is called a stochastic gradient algorithm, see Duflo (1996) or Gelfand and Mitter (1993)). Under appropriate conditions on the recursion (19) and on the cooling schedule (i.e., on the rate of convergence of c_k), Gelfand and Mitter (1993) have shown that the sequence {S_k} converges in distribution towards the global maxima of f. Unfortunately, this procedure cannot be used directly in our application. In fact, the vector of statistics S_k must belong to a certain subset S of ℝ^p. Adding an isotropic Gaussian white noise in equation (15) does not ensure that S_k ∈ S. In particular, we can obtain a value of S_k such that, when computing the new value θ_{k+1}, the estimated variance is negative. We propose instead to add a random perturbation with a covariance structure that ensures that S_k ∈ S. For this purpose, we propose to generate the input signals in the MCMC procedure with the posterior distribution k_T defined by:
k_T(z | y; a_k, σ_k²) = k(z | y; a_k, (1 + T_k)² σ_k²).

In other words, we consider the "false" model at iteration k:

Y(t) = Σ_{l=0}^{L} a(l) Z(t − l) + σ(1 + T_k) ε(t).    (20)
The sequence {T_k} is a positive sequence of temperatures that decreases slowly towards 0. It is clear that the form (19) still holds and that the vector of statistics S_k remains in S, but the distribution of e_k has been artificially modified. The role of T is clear: the bigger T is, the less concentrated k_T is around its modes. The local maxima of g can thus be avoided during the first iterations. The original version of SAEM is obtained with T_k = 0 for all k. Although the analysis of this new algorithm is heuristic, the simulations give very good results. A theoretical analysis is still an open problem.
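The tempering scheme above amounts to two small formulas: a decreasing temperature sequence and an inflated noise variance for the MCMC step. A minimal sketch (function names are ours; the geometric schedule and the values T_0 = 4, α = 0.9 are those used in the seismic example of Section 3.4):

```python
def temperature(k, T0=4.0, alpha=0.9):
    """Geometric cooling schedule T_k = alpha^k * T0."""
    return T0 * alpha ** k

def tempered_sigma2(sigma2_k, T_k):
    """Noise variance of the tempered posterior k_T: sigma_k^2 is replaced by
    (1 + T_k)^2 sigma_k^2, so the chain explores more widely while T_k > 0
    and plain SAEM is recovered at T_k = 0."""
    return (1.0 + T_k) ** 2 * sigma2_k
```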
3.4 Numerical experiments

To illustrate the performance of the proposed algorithm, we consider a simple example where the input signal is an independent and identically distributed sequence of random variables distributed according to a mixture of two zero-mean Gaussian distributions with different variances:

Z_i ~ λ N(0, σ_1²) + (1 − λ) N(0, σ_2²).    (21)
This model is standard in deconvolution of seismic traces (see Lavielle, 1991, or Mendel, 1983). We set λ = 0.25, σ_1² = 20 and σ_2² = 1. The variance of the additive noise is adjusted in such a way that the signal-to-noise ratio equals 13 dB (the variance of the signal is 20 times the variance of the noise). The filter a is of length L = 4. It is real but not symmetric in this example, in order to obtain complex (and conjugate) roots. The observed sequence y is of length n = 1000. We present in Figure 1 an estimation of the filter a obtained with the SAEM algorithm. In this example, the initial guess allows a correct estimation of the coefficients of the filter (Figure 1-a). Figure 1-c shows the trajectory of the four roots r_k = (r_k(1), r_k(2), r_k(3), r_k(4)). In the second example (Figure 2), obtained with another initial guess, the four roots converge to erroneous values that are close to the inverses of the true values. We can see in Figure 2-b that the modulus of the estimated transfer function |Â| is close to |A|. In other words, the estimated filter and the original filter essentially differ in their phase. Using the simulated annealing version of SAEM with the same initial guess, the trajectory of r_k now escapes from the domain of attraction of the wrong roots obtained previously (Figure 3). In seismic deconvolution, the observed data is a set of seismic traces and the non-observed data series Z is the sequence of reflection coefficients of the earth. To a first approximation, a seismic trace is the convolution of the reflectivity sequence with a wavelet sent from the top of the earth. A random measurement noise is also present. The distribution proposed in (21) is sometimes used to model the reflectivity sequence (see Lavielle, 1991 and Mendel, 1983). In fact, the main reflectors, which indicate separations between the layers, have a larger variance than the secondary reflectors, which indicate small variations inside layers.
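Simulating inputs from the mixture (21) with the parameter values of this experiment is straightforward; the sketch below (a hypothetical helper, with the paper's λ = 0.25, σ_1² = 20, σ_2² = 1 as defaults) draws a Bernoulli indicator for each sample and then the corresponding Gaussian.

```python
import numpy as np

def simulate_reflectivity(n, lam=0.25, var1=20.0, var2=1.0, seed=0):
    """Draw n i.i.d. samples from the mixture of Eq. (21):
    Z ~ lam * N(0, var1) + (1 - lam) * N(0, var2)."""
    rng = np.random.default_rng(seed)
    big = rng.uniform(size=n) < lam  # main reflectors occur with probability lam
    return np.where(big,
                    rng.normal(0.0, np.sqrt(var1), n),
                    rng.normal(0.0, np.sqrt(var2), n))
```

The marginal variance of such a sequence is λσ_1² + (1 − λ)σ_2² = 5.75 with the default parameters, which is what the signal-to-noise calibration above is based on.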
The wavelet (or filter) estimated from real data is displayed in Figure 4. Conventionally, the positive part is shaded. In order to validate this result, we used this estimated filter for simulating synthetic data, with a simulated sequence Z. Then, we used SAEM to recover it. The original filter and the initialization are displayed in Figure 5-a (the initialization is a spike at t = 5 while the filter is symmetric around t = 10). With this initial guess, SAEM converges to a wrong solution (Figure 5-b). On the other hand, the simulated annealing version of SAEM allows a correct recovery of the system phase (Figure 5-c). It is interesting to remark that these two estimated filters essentially differ in their phase (Figure 5-d). For this example, we used the following cooling schedule: T_k = α^k T_0, where T_0 = 4 and α = 0.9.
References

Amit, Y. and Grenander, U. (1991) Comparing sweep strategies for stochastic relaxation. J. Mult. Analysis 37, 197-222.
Arslan, O., Constable, P., and Kent, J. (1993) Domains of convergence for the EM algorithm: a cautionary tale in a location estimation problem. Statistics and Computing 3, 103-108.
Benveniste, A., Métivier, M., and Priouret, P. (1990) Adaptive Algorithms and Stochastic Approximations. Springer-Verlag.
Dempster, A., Laird, N., and Rubin, D. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B 39, 1-38.
Duflo, M. (1996) Algorithmes Stochastiques. SMAI, Springer.
Gelfand, S. and Mitter, S. (1993) Metropolis-type annealing algorithms for global optimization in R^d. SIAM J. Control and Optimization 31(1), 111-131.
Giannakis, G. and Mendel, J. (1989) Identification of non-minimum phase systems using higher-order statistics. IEEE Tr. on Acoust. Speech and Sig. Proc. 37, 360-377.
Kaleh, G. and Vallet, R. (1994) Joint parameters estimation and symbols detection for digital communication over linear or nonlinear unknown channels. IEEE Transactions on Communications 7, 2406-2413.
Kesten, H. (1958) Accelerated stochastic approximation. Ann. Math. Statist. 29, 41-59.
Lavielle, M. (1991) 2-D Bayesian deconvolution. Geophysics 56(12), 2008-2018.
Lavielle, M. and Moulines, E. (1995) On a stochastic approximation version of the EM algorithm. Tech. rep., Publication Université Paris-Sud.
Mendel, J. (1983) Optimal Seismic Deconvolution. Academic Press Inc.
Nikias, C. and Mendel, J. (1993) Signal processing with higher-order spectra. IEEE Signal Processing Magazine 3, 10-37.
Polyak, B. (1990) New stochastic approximation type procedures. Automatika i Telemekhanika (translated in Autom. & Remote Contr.) 51, 98-107.
Tanner, M. (1993) Tools for Statistical Inference: Methods for Exploration of Posterior Distributions and Likelihood Functions. Springer-Verlag: Springer Series in Statistics.
Wei, G. and Tanner, M. (1990) A Monte-Carlo implementation of the EM algorithm and the Poor Man's data augmentation algorithms. J. Amer. Stat. Assoc. 85, 699-704.
Wu, C. (1983) On the convergence properties of the EM algorithm. The Annals of Stat. 11, 95-103.
Fig. 1 Estimation of the filter obtained with SAEM: (a) estimation of the impulse response of the filter, (b) estimation of the modulus of the transfer function of the filter, (c) estimation of the roots of the transfer function. The unit circle and the trajectories of the estimated roots are represented in the complex plane. Markers distinguish the original roots, the initialization, and the estimation (o).
Fig. 2 Estimation of the filter obtained with SAEM and another initial guess: (a) estimation of the impulse response of the filter, (b) estimation of the modulus of the transfer function of the filter, (c) estimation of the roots of the transfer function. Markers distinguish the original roots, the initialization, and the estimation (o).
Fig. 3 Estimation of the roots of the transfer function, obtained with the simulated annealing version of SAEM. Markers distinguish the original roots, the initialization, and the estimation (o).
Fig. 4 Estimation of the wavelet, obtained from a real data set of seismic traces. The positive part is shaded.
Fig. 5 (a) The original filter of length L = 20 and the initialization (a spike at t = 5), (b) estimation obtained with SAEM, (c) estimation obtained with the simulated annealing version of SAEM, (d) comparison of the moduli of the transfer functions of the filters displayed in (b) (dashed line) and (c) (solid line).