Generalized Multiple-Model Adaptive Estimation ... - John L. Crassidis

Report 3 Downloads 27 Views
Generalized Multiple-Model Adaptive Estimation Using an Autocorrelation Approach Badr N. Alsuwaidan National Satellite Technology Program, KACST, Riyadh, Saudi Arabia

John L. Crassidis University at Buffalo, State University of New York, Amherst, NY 14260-4400

Yang Cheng Mississippi State University, Mississippi State, MS 39762 Abstract In this paper a generalized multiple-model adaptive estimator is presented that can be used to estimate unknown model and/or filter parameters, such as the noise statistics in filter designs. The main goal of this work is to provide an increased convergence rate for the estimated model parameters over the traditional multiple-model adaptive estimator. Parameter elements generated from a quasi-random sequence are used to drive multiple parallel filters for state estimation. The current approach focuses on estimating the process noise covariance by sequentially updating weights associated with the quasi-random elements through the calculation of the likelihood function of the measurement-minus-estimate residuals, which also incorporates correlations between various measurement times. For linear Gaussian measurement processes the likelihood function is easily determined. A proof is provided that shows the convergence properties of the generalized approach versus the standard multiple-model adaptive estimator. Simulation results, involving a two-dimensional target tracking problem and a single-axis attitude problem, indicate that the new approach provides better convergence properties over a traditional multiple-model approach. Index Terms Generalized Multiple-Model Adaptive Estimation, Correlation, Residual.

I. I NTRODUCTION Multiple-model adaptive estimation (MMAE) uses a parallel bank of filters to provide multiple estimates, where each filter corresponds with a dependence on some unknowns. The state estimate is provided through a sum of each filter’s estimate weighted by the likelihood of the unknown elements conditioned on the measurement sequence. The likelihood function gives the associated hypothesis that each filter is the correct one. Many applications of MMAE approaches exist [1]. Three generations of MMAE have been characterized. The first is the classical approach presented in Ref. 2. The second is the interacting multiple-model algorithm [3] and the third is variable structure multiple-model estimation [4]. Judiciously determining the filter bank size has also been an active area of research [5]. Adaptive filtering can be divided into four general categories: Bayesian, maximum likelihood, covariance matching, and correlation approaches [6]. Bayesian and maximum likelihood methods may be well suited to multiple-model approaches, but sometimes require large computational loads. Covariance matching Badr Alsuwaidan is a Research Assistant Professor at the Space Research Institute, KACST. Email: [email protected] John Crassidis is a Professor in the Department of Mechanical & Aerospace Engineering. Email: [email protected] Yang Cheng is an Assistant Professor in the Department of Aerospace Engineering. Email: [email protected]

1

is the computation of the covariances from the residuals of the state estimation problem, but has been shown to give biased estimates of the true covariances. A widely used correlation-based approach for a linear Kalman filter with stationary/Gaussian process and measurement noise is based on “residual whitening” [7]. In particular, the autocorrelation matrix, which can be computed from the measurementminus-estimate residuals, is used with the system state matrices to provide a least-squares estimate of the Kalman filter error covariance times the measurement output matrix. If the number of unknowns in the process noise covariance is equal to or less than the number of states times the number of outputs, then the error-covariance/output-matrix estimate can be used to find an estimate of the process noise covariance by solving for a set of linear equations. These equations are not linearly independent and one has to choose a linearly independent subset of these equations [7]. Standard MMAE approaches use only the current time measurement-minus-estimate residual in the algorithm. The approach shown in this paper is a generalization of the approach shown in Ref. [8], which uses the time correlation of the filter residuals to assign the likelihood for each of the modeled hypotheses. In particular, the spectral content of the residuals is used and only scalar measurements are assumed in Ref. [8]. The authors also state that if multiple measurements are available, then a diagonal matrix can be used with elements given by the spectral content of each measurement residual, but this assumes that the cross-correlation terms are negligible. Also, the focus of their paper is on the detection of actuator failures with known control-input frequency content. The main goal of this work is to provide an increased convergence rate for the estimated parameters over the traditional multiple-model adaptive estimator. The new approach shown here, called generalized multiple-model adaptive estimation (GMMAE), is based on calculating the time-domain autocorrelation function [7], which is used to form the covariance of a generalized residual involving any number of backward time steps. This covariance matrix also includes the time correlated terms, thus providing a more rigorous approach than the approach shown in Ref. [8]. The unknown elements in the design are the parameters of the process noise covariance, but the approach shown here is general to any MMAE application and can be used to estimate any model parameters. Process noise covariance elements can be drawn from any sample distribution as long as the resulting covariance matrix remains positive semidefinite. A theoretical proof of the convergence properties of the GMMAE is also shown, which offers insight as to its advantages over the standard MMAE approach. The organization of the remainder of this paper proceeds as follows. First, the standard Kalman filter equations are summarized, since this filter will be used in the simulations. Then, a review of the standard MMAE algorithm is given. Next, the new adaptive approach is shown, including the assumptions used for linear time-varying systems. The theoretical background for the GMMAE approach and the proof of convergence are then provided. Finally, simulation results involving a two-dimensional target tracking problem and a single-axis attitude estimation problem are shown. II. K ALMAN F ILTER AND M ULTIPLE -M ODEL A DAPTIVE E STIMATION A summary of the discrete Kalman filter is given in Table I, where xk is the state vector, uk is the known control input, Φk is the discrete-time state transition matrix, Υk is the process noise distribution matrix, wk is the process noise vector which is assumed to be a zero-mean Gaussian noise process with ˜ k is the discrete-time measurement, vk is the measurement noise vector which is assumed covariance Qk , y ˆ− ˆ+ to be a zero-mean Gaussian noise process with covariance Rk , x k and x k are the propagated and updated − + state estimates, respectively, and Pk and Pk are the propagated and updated covariances, respectively. Multiple-model adaptive estimation is a recursive estimator that uses a bank of M filters that depend on some unknown parameters, denoted by the vector p, which is assumed to be constant (at least throughout the interval of adaptation). The MMAE process is shown in Figure 1. Details of the MMAE approach can

2

TABLE I D ISCRETE -T IME L INEAR K ALMAN F ILTER

Model

xk+1 = Φk xk + Γk uk + Υk wk , ˜ k = Hk xk + vk , y

wk ∼ N (0, Qk )

vk ∼ N (0, Rk )

Initialize

ˆ (t0 ) = x ˆ0 x  ˜ (t0 ) x ˜ T (t0 ) P0 = E x

Gain

Kk = Pk− HkT [Hk Pk− HkT + Rk ]−1 ˆ− ˆ+ ˆ− x yk − Hk x k] k = xk + Kk [˜

Update

Pk+ = [I − Kk Hk ]Pk− ˆ+ ˆ− x k+1 = Φk xk + Γk uk

Propagation

uk

− T Pk+1 = Φk Pk+ ΦT k + Υk Q k Υk

Unknown System

yk

Real System xˆ k (1)

KF 1

MMAE

!

e(1) k xˆ k (2)

KF 2

!

xˆ k

e(2) k xˆ k ( M )

KF M Ck(1),0 Ck(2) ,00 ) Ck( M ,0

Fig. 1.

e(kM )

Likelihood

(1) k (2) k

(M ) k

MMAE Process

be found in Refs. [9] and [10]. Note that we do not need to make the stationary assumption for the state and/or output processes though, i.e. time-varying state and output matrices can be used. It is assumed that the unknown parameters belong to a discrete set of elements generated for each of the M filters from some known probability density function (pdf) of p, denoted by p (p), to give {p(ℓ) ; ℓ = 1, . . . , M }. The goal of the estimation process is to determine the conditional pdf of the ℓth element p(ℓ) given all

3

the measurements. Application of Bayes’ rule yields ˜ k , p(ℓ) ) p (Y ˜ k) p (Y ˜ p (Yk |p(ℓ) ) p (p(ℓ) ) = M X ˜ k |p(j) ) p (p(j) ) p (Y

˜ k) = p (p(ℓ) |Y

(1)

j=1

˜ k denotes the sequence {˜ ˜1, . . . , y ˜ k }. The a posteriori probabilities can be computed through where Y y0 , y [11] ˜ k−1 ) yk , p(ℓ) |Y ˜ k ) = p (˜ p (p(ℓ) |Y ˜ k−1 ) p (˜ yk |Y −(ℓ)

=

˜ k−1 ) p (˜ yk |ˆ xk ) p (p(ℓ) |Y M i Xh −(j) ˜ k |ˆ ˜ k−1 ) p (Y x ) p (p(j) |Y

(2)

k

j=1

−(ℓ) ˜ k−1 , p(ℓ) ) given by p (˜ with p (˜ yk , |Y yk |ˆ xk ) in the Kalman recursion. It is computed as a pdf of the (ℓ) ˆ− ˜ k − Hk x Kalman filter measurement-minus-estimate residual ek = y k . Suppose that ek is a zero-mean (ℓ) Gaussian random vector with covariance Ck,0 . Then   q 1 T  (ℓ) −1 (ℓ) (ℓ) ˜ p (˜ yk , |Yk−1 , p ) = 1/ det[2π Ck,0 ] exp − ek Ck,0 (3) ek 2

˜ k ) is a pdf. The a Note that the denominator of (2) is just a normalizing factor to ensure that p (p(ℓ) |Y (ℓ) posteriori probability update in (2) recursion formula can now be cast into a set of defined weights ̟k , so that (ℓ)

(ℓ) ̟k

=

−(ℓ)

̟k−1 p (˜ yk |ˆ xk M X

(j)

)

−(j)

̟k−1 p (˜ yk |ˆ xk

)

(4)

j=1

(ℓ) ̟k

˜ k ). Note that only the current time measurement y ˜ k is needed to update the where ≡ p (p(ℓ) |Y (ℓ) weights. The weights at time t0 are initialized to ̟0 = 1/M for ℓ = 1, 2, . . . , M . The convergence properties of MMAE are shown in Ref. [11], which assumes ergodicity in the proof. The ergodicity assumptions can be relaxed to asymptotic stationarity and other assumptions are even possible for nonstationary situations [12]. −(j) ˆk : The conditional mean estimate is the weighted sum of the parallel filter estimates x ˆ− x k =

M X

(j) −(j)

ˆk ̟k x

(5)

j=1

Also, the error covariance of the state estimate can be computed using   M T  X −(ℓ) −(j) (j) −(j) − ˆ ˆ ˆk − x ˆ− + P x − x ̟k x Pk− = k k k k j=1

(6)

4

ˆ k , and error covariance, denoted by Pk , are given by The specific estimate for p at time tk , denoted by p ˆk = p

M X

(j)

̟k p(j)

(7a)

j=1

Pk =

M X

(j)

̟k

j=1

T   ˆk ˆ k p(j) − p p(j) − p

(7b)

ˆ k . The discrete distribution of the Equation (7b) can be used to define 3σ bounds on the estimate p uncertainty of the parameter vector p is represented by M elements of p(j) . If M is large and the significant regions of the parameter space of p are well represented by p(j) , then (7a) is a good approximation of the conditional mean of p. III. A DAPTIVE L AW BASED

ON

AUTOCORRELATION

In this section the adaptive law, based on an autocorrelation approach, for the process noise covariance matrix is shown. First, the autocorrelation for time-varying systems is derived, followed by the associated likelihood functions for the defined measurement residuals. A. Autocorrelation for Time-Varying Systems In this section the autocorrelation matrix for linear time-varying systems is derived, which is an extension to the approach shown in Ref. [7]. Here we assume that the model is linear with xk+1 = Φk xk + Γk uk + Υk wk ˜ k = Hk xk + vk y

(8a) (8b)

Note that the to-be-estimated parameter vector p can be function of any unknown model quantity in (8). For example it can be related to parameters in the state model matrices or parameters related to the covariance of the process and/or measurement noise. Here we focus on elements of the process noise covariance. The process noise is assumed to be a zero-mean Gaussian white-noise process so that Qk is a constant, denoted by Q. Consider the following discrete-time residual equation: ˆ− ˜ k − Hk x ek ≡ y k ˜− = −Hk x k + vk

(9)

˜ k−1 with Kk−1 being an ˜− ˆ− ˆ− ˆ− where x k ≡ x k − xk and x k = Φk−1 (I − Kk−1 Hk−1 ) x k−1 + Φk−1 Kk−1 y arbitrary gain. Note that Kk−1 is not necessarily the optimal Kalman gain, which is computed using the actual noise statistics. The following autocorrelation matrix can be computed:  − T  i=0 Hk Pk Hk + Rk Ck, i = (10)   − −T T  − T  ˜k x ˜ k−i Hk−i − Hk E x ˜ k vk−i Hk E x i>0  ˜− where Ck, i ≡ E ek eTk−i and E{·} denotes expectation. The propagation of x k is given by ˜− ˜− x k = Φk−1 (I − Kk−1 Hk−1 ) x k−1 + Φk−1 Kk−1 vk−1 − Υk−1 wk−1

(11)

5

Carrying (11) i steps back leads to   i Y ˜− =  ˜− x Φk−j (I − Kk−j Hk−j ) x k

k−i

j=1



"j−1 i Y X j=2

#

+

"j−1 i Y X j=2

#

Φk−ℓ (I − Kk−ℓ Hk−ℓ ) Φk−j Kk−j vk−j

ℓ=1

(12)

Φk−ℓ (I − Kk−ℓ Hk−ℓ ) Υk−j wk−j + Φk−1 Kk−1 vk−1 − Υk−1 wk−1

ℓ=1

where

i Y

Zk−j ≡ Zk−1 Zk−2 · · · Zk−i

(13)

j=1

Performing the expectations in the definition of Ck, i leads to [13]  Hk Pk− HkT + Rk i=0        − T i=1 Ck, i = Hk Φk−1 (Pk−1 Hk−1 − Kk−1 Ck−1, 0 )     h i   − H Qi−1 Φ T (I − K H ) Φk−i (Pk−i Hk−i − Kk−i Ck−i, 0 ) i > 1 k−j k−j k−j k j=1

(14)

where

− T Ck−i, 0 ≡ Hk−i Pk−i Hk−i + Rk−i

Pk−

(15)

− Pk−1

In the above equations, and are computed using the true covariance matrix Q of the process noise wk . If the Kk−j matrices are computed using the true Q, too, then Ck, i = 0 for i ≥ 1. Note that storage of the state model and covariance matrices to the k − i point is required to compute Ck, i in general. B. Likelihood Function In this section the likelihood function for the measurement residual is shown. First, the following residual is defined:   ek ek−1    (16) ǫk,i ≡  .   ..  ek−i

The likelihood function associated with ǫk,i is given by Lk,i =

  1 1 T −1 ǫ C ǫ exp − k,i 2 k,i k,i [det(2π Ck,i )]1/2

where Ck,i ≡ E{ǫk,i ǫTk,i } is given by  Ck, 0 T Ck,  T1  Ck,i = Ck, 2  ..  . T Ck, i

Ck, 1 Ck−1, 0 T Ck−1, 1 .. . T Ck−1, i−1

Ck, 2 Ck−1, 1 Ck−2, 0 .. . T Ck−2, i−2

 ··· Ck, i · · · Ck−1, i−1   · · · Ck−2, i−2    .. ..  . . ··· Ck−i, 0

(17)

(18)

6

When i = 0 the likelihood function reduces down to   1 T 1 − T −1 Lk,0 =  1/2 exp − 2 ek (Hk Pk Hk + Rk ) ek det[2π (Hk Pk− HkT + Rk )]

(19)

This likelihood is widely used in MMAE algorithms [9]. C. MMAE Adaptive Law The MMAE adaptive law is given by (ℓ)

(ℓ)

(ℓ)

̟k = ̟k−1 Lk,0 (ℓ)

̟k ←

(ℓ)

̟k M X (j) ̟k

(20)

j=1

−(ℓ)

since p (˜ yk |ˆ xk

(ℓ)

) = Lk,0 , which is defined by

  1 (ℓ)T 1 (ℓ) −(ℓ) T −1 (ℓ) exp − Lk,0 = n e (H P H + R ) e k k k o1/2 k k 2 k −(ℓ) det[2π (Hk Pk HkT + Rk )]

(ℓ)

(21)

−(ℓ)

ˆk . ˜ k − Hk x where ek ≡ y (ℓ) (ℓ) (ℓ) The actual statistics of ek are determined by Q and Kk−j . The Kalman gains Kk−j are computed under the ℓth hypothesis that the true covariance of the process noise is Q(ℓ) . In the likelihood function −(ℓ) (ℓ) above, the hypothesized residual covariance (Hk Pk HkT + Rk ) of ek is computed using Q(ℓ) and (ℓ) Kk−j . The actual statistics of the residuals match the hypothesized only when Q(ℓ) = Q. D. GMMAE Adaptive Law The GMMAE adaptive law is a modification of the MMAE adaptive law, given by (ℓ)

(ℓ)

(ℓ)

̟k = ̟k−1 Lk,i (ℓ)

̟k ←

(ℓ)

̟k M X (j) ̟k

(22)

j=1

with (ℓ) Lk,i

(ℓ)

(ℓ)

1

= h  i1/2 (ℓ) det 2π Ck,i (ℓ)T

(ℓ)T

 1 (ℓ)T  (ℓ) −1 (ℓ) ǫk,i exp − ǫk,i Ck,i 2 

(ℓ)T

(23)

ek−1 · · · ek−i ]T , of which the actual statistics are determined where ǫi is defined as ǫi ≡ [ek (ℓ) (ℓ) by Q and Kk−j , which are computed using Q(ℓ) . The matrix Ck,i is given by (14) and (18), which is evaluated at Q(ℓ) and the optimal Kalman gain, which is a function of the actual covariance Q. Because ˆ of it, computed using p ˆk. Q is unknown, it is approximated by the current best estimate Q Like the MMAE adaptive law, the GMMAE adaptive law is set up such the parameter sets used to compute Q(ℓ) that result in good agreement between the actual and expected residual statistics receive

7

large weights. On the right-hand side of (22), the more recent weight ̟k−1 instead of ̟k−i is used, which in essence raises the power of the weight and the likelihood function. As a result, the GMMAE estimate given by (7) of the unknown p is closer to the maximum of the likelihood than it is towards the conditional likelihood mean or the minimum mean-square error estimate as provided by MMAE. A technique of raising the power of the weights in the MMAE approach has also been proposed in Ref. [14]. But the GMMAE solution presented here provides a clearer understanding of how this is done and it also incorporates correlated terms to increase performance characteristics. (ℓ) (ℓ) The estimates for Ck, i , which are sub-blocks of Ck,i , are given by  −(ℓ)  Hk Pk HkT + Rk i=0           −(ℓ) T  ˆ k−1 C (ℓ) Hk Φk−1 Pk−1 Hk−1 − K i=1 k−1, 0 (ℓ) (24) Cˆk, i =   i  h  Qi−1   ˆ  Φk−i Hk  j=1 Φk−j I − Kk−j Hk−j      (ℓ) × P −(ℓ) H T − K ˆ k−i C i>1 k−i k−i k−i, 0

where

(ℓ)

−(ℓ)

T Ck−i, 0 ≡ Hk−i Pk−i Hk−i + Rk−i

−(ℓ)

The covariance matrix Pk

(25)

is computed using −(ℓ)

(ℓ)

Kk

+(ℓ)

Pk+1 = Φk Pk ΦTk + Q(ℓ)   −(ℓ) +(ℓ) (ℓ) Pk = I − Kk Hk Pk −1  −(ℓ) −(ℓ) = Pk HkT Hk Pk HkT + Rk

where Q(ℓ) is computed using p(ℓ) . The estimate of the optimal gain is computed using −1  ˆ k = Pˆ − H T Hk Pˆ − H T + Rk K k k k k

(26a) (26b) (26c)

(27)

with

− ˆk Pˆk+1 = Φk Pˆk+ ΦTk + Q   ˆ k Hk Pˆ − Pˆk+ = I − K k

(28a) (28b)

˜ k , along with the ℓth element, p(ℓ) , 1 ≤ ℓ ≤ M , a bank of filters is Using the current measurement, y −(ℓ) (ℓ) ˆ k , and measurements are used to form the residual, ǫk,i , executed. For each filter the state estimates, x −(ℓ) −(ℓ) −(ℓ) going back i steps. The filter error covariance, Pk , and state matrices, Φk and Hk , are used to (ℓ) update the estimate of the autocorrelation, denoted by Cˆk,i . Note that at each new measurement time, all (ℓ) ˆ k is provided, which is used to compute elements of Cˆk,i need to be recalculated since a new estimate p an estimate of the optimal gain. Unfortunately, this can increase the computational costs. The diagonal elements do not need to be recomputed though, since they are not a function of the optimal gain. The (ℓ) residuals and autocorrelations are then used to evaluate the likelihood functions Lk,i . These functions are ˆ k using (7a). used to update the weights, which gives the estimate p

8

IV. GMMAE C ONVERGENCE1 A proof of convergence of the GMMAE approach for linear systems is shown, as well as its relationship to the standard MMAE approach. A. Autocorrelation of the Residuals One way to estimate the correlation of the residuals (the product of two residual values) is by using the ergodic average (time average). However, the GMMAE approach is a realtime adaptive approach and the estimate of the correlation of the residuals has to be determined and provided to the GMMAE likelihood function to test the new measurement-residual optimality for the given parameter p(ℓ) . The GMMAE estimate Cˆk,i provides an estimate for the correlation of the residuals. The residuals of the optimal Kalman filter, in which the Kalman gain is computed using the actual noise covariances, are uncorrelated. However, the residuals of the suboptimal Kalman filter are correlated. There are different ways to estimate the autocorrelation. One way uses the ergodicity property of a stationary random process. The ergodic estimate of the autocorrelation is given by N 1 X ej eTj−i C¯k−1, i = N j=i

(29)

where N is a large number of samples. The ergodic estimate of the autocorrelation C¯k−1, i can be used ˆ −T to estimate the true state error {ˆ x− k−1 x k−1 } which is the approach introduced in Ref. [7]. The estimate of − −T T ˜ k−1 } cannot typically be determined, but rather {˜ ˜ −T {˜ xk−1 x x− k−1 x k−1 }Hk−i is determined instead. The term \T −(ℓ) T ˜ −T {˜ x− k−1 x k−1 }Hk−i is denoted by Pk−i Hk−i and can be estimated from (14) using the autocorrelation ergodic estimate C¯k−1, i : \T (ℓ) T (ℓ) T −(ℓ) (ℓ) (ℓ) [C¯k, i ......C¯k−i+1, 1 ]T = Ak,i [Pk−i Hk−i − Kk−i Ck−i, 0 ]

(30)

where Ak,i is defined as

Ak,i

i    hQ i−1 ˆ Φk−i Hk j=1 Φk−j I − Kk−j Hk−j    .. =   . Hk−i+1 Φk−i

(31)

ˆ k−1 would be the optimal gain to minimize the state error in a least-squares The estimated optimal gain K \ ˆ k−i = P −(ℓ) H T C −1 , the optimal gain is estimated as sense. Noticing K k−i k−i k−i, 0 (ℓ) T (ℓ) T ˆ k−i − K (ℓ) ] C (ℓ) [C¯k, i ......C¯k−i+1, 1 ]T ∼ = Ak,i [K k−i k−i, 0

(32)

(ℓ) (ℓ) T (ℓ) T (ℓ) −1 † ˆ k−i ∼ K = Kk−i + Ak,i [C¯k, i ......C¯k−i+1, 1 ]T Ck−i, 0

(33)

where A† is the pseudo-inverse of A. This provides another estimate to the autocorrelation of the residuals of a suboptimal filter. The GMMAE approach is a recursive adaptive approach. The result of the previous time evaluated to determine the current estimate will be used to predict future estimate. The GMMAE estimate of the ˆ k−1 in (24) is determined by the weight ̟k−1 , which is a measure of the likelihood of optimal gain K 1 This

section is the work of the first two authors.

9

ek eT k−i

ˆ (ℓ) C k+1, i 0

¯ (ℓ) C k, i 0

Fig. 2.

k k+1 Time

1

N

Correlation of the Residuals of an Optimal Kalman Filter

ˆ− ˆ− the residual ek−1 = [˜ yk−1 − Hk x k ] and the previous residuals. The estimate x k−1 is the Kalman filter + ˆ k−2 . Therefore, the weight ̟k−1 is a measure of the performance of ˆ k−2 updated by K propagation of x ˆ k−2 since K ˆ k−1 has not contributed to the residual ek−1 yet. The filter that has the highest weight ̟k−1 K ˆ k−2 → K (ℓ) as Pˆk−2 → P −(ℓ) . The has the minimum residual ek−1 and the Kalman gain estimate K k−2 k−2 ˆ k−2 is most likely to be the optimal gain. ˜ k−1 complies more with the estimate x ˆ− measurement y , where K k ˆ k−1 given by the GMMAE algorithm in (24) is actually a propagation The estimate of the optimal gain K ˆ k−2 determined by ̟k−1 based on the likelihood evolution of the residuals performance of the gain of K ˆ k−2 as shown in (27). Any estimate based on the weight ̟k−1 refers to K ˆ k−2 in reality. For estimate K (ℓ) the suboptimal filter, the ergodic estimate of the correlation of the residuals, C¯k−1, 1 , is given using (32) by (ℓ) ˆ k−2 − K (ℓ) ] C (ℓ) C¯k−1, 1 = Ak−1,1 [K k−2 k−2, 0

(34)

(ℓ) The GMMAE estimate of the autocorrelation Cˆk, i in (24), that is used in the GMMAE likelihood, is ˆ k−1 : based on the gain estimate of K (ℓ) T

(ℓ) T

−(ℓ)

(ℓ)

(ℓ)

T [Cˆk, i ......Cˆk−i+1, 1 ]T = Ak,i [Pk−i Hk−i − Kk−i Ck−i, 0 ]

(35)

In the GMMAE hypothesis, the likelihood of the residual is evaluated with respect to its optimal expected values. The assumption of optimality of the parameter p(ℓ) is essential to compute the likelihood. Even though the zero residual correlation would be the obvious choice for optimality, the estimate of the (ℓ) (ℓ) correlation, Cˆk, i , is based on the available information of the previous correlation C¯k−1, i carried by the ˆ (ℓ) . The actual correlation of the residuals of the optimal filter GMMAE estimate of the optimal gain K k−i (ℓ) is a random variable and has zero mean. Basically, the optimal filter correlation of the residuals, Cˆk, i , (ℓ) is expected to compensate for the previous correlation as shown in Figure 2. If the parameter p is an optimal parameter, the correlation of the residuals has zero mean while the actual correlation fluctuates around this mean. Considering a linear time-invariant system, and substituting the Kalman gain estimate ˆ k−2 from (34) into (35), where K ˆ k−1 = K ˆ k−2 at steady state, leads to K (ℓ) −(ℓ) T (ℓ) (ℓ) (ℓ) −1 (ℓ) Cˆk, 1 = Ak,1 [Pk−1 Hk−1 − (Kk−2 + A†k−1,1 C¯k−1, 1 ) Ck−2, 0 Ck−1, 0 ] (ℓ) (ℓ) (ℓ) −1 (ℓ) Cˆk, 1 = −Ak,1 A†k−1,1 [C¯k−1, 1 Ck−2, 0 ] Ck−1, 0 (ℓ) Cˆk, 1

=

(36)

(ℓ) −C¯k−1, 1

(ℓ) The predicted estimate Cˆk, 1 has the opposite sign of the ergodic estimate [15]. To ensure the optimality of the Kalman filter, the expected future correlation has to compensate any divergence of the previous

10

(ℓ) correlation of the zero mean. The autocorrelation Cˆk, 1 is incorporated in the GMMAE likelihood to increase the chance that the parameter p(ℓ) is chosen as the optimal one. For the case that the residuals are produced by a suboptimal filter, the correlation offset from zero mean is not only a result of the randomness of the residual but also a result of the mis-modeling of the filter parameter. The correlation (ℓ) for the suboptimal filter will converge at the limit to the ergodic correlation estimate C¯k, 1 and the relation given in (36) will serve as a tool in the GMMAE adaptive law to distinguish between the optimal filter and suboptimal filter residuals by testing for the whiteness of the filter residuals.

B. GMMAE Convergence We expect that for the GMMAE approach to converge, one of the elements p(ℓ) is nearly equal to ˜ k,i ) → 1 as k → ∞ and p (p(ℓ) |Y ˜ k,i ) → 0 as k → ∞ for the true parameter ptrue so that p (ptrue |Y p(ℓ) 6= ptrue . Also, the p(ℓ) elements that have values far from the true parameter have faster rates of ˜ k,i ) → 0 as k → ∞, than the ones closer to the true parameter. This result convergence to zero, p (p(ℓ) |Y will be proved with the assumption of ergodicity of residuals and the use of the following lemma of standard matrix theory [12]. Let A and B be two n × n positive definite matrices, then   |A| − tr[B −1 A] ≤ 0 (37) n + ln |B| where tr denotes the matrix trace. The convergence of a linear time-invariant model is first considered here. Recall the adaptive law for the GMMAE approach: (ℓ)

(ℓ)

Lk,i ̟k−1

(ℓ)

̟k =

M X

(j)

(j)

Lk,i ̟k−1

(38)

j=1

For simplicity the normalization in the denominator can be replaced by the dominating weight since all others will converge to zero eventually: (ℓ)

(ℓ) ̟k

=

(ℓ)

Lk,i ̟k−1 Ltrue k,i

true where Ltrue . Substituting for Lk,i from (23) into (39) gives k,i is the likelihood evaluated using p    −1 (ℓ) −1/2 (ℓ)T (ℓ) (ℓ) ǫk,i exp − 21 ǫk,i Ck,i Ck,i (ℓ) (ℓ)  ̟k−1  ̟k =  −1 (O) −1/2 (O)T (O) (O) Ck,i ǫk,i exp − 21 ǫk,i Ck,i (O)

(39)

(40)

where Ck,i is the autocorrelation matrix associated with the optimal filter using ptrue . Then, taking the natural logarithm of both sides gives  (O)  !  (ℓ)  −1  1   −1  ̟k 1  Ck,i  1 (ℓ) (ℓ)T (ℓ) (O) (O)T (O) ln = ln − tr ǫk,i ǫk,i Ck,i + tr ǫk,i ǫk,i Ck,i (41) (ℓ) (ℓ) 2 2 2 ̟ C k−1

k,i

11

Next, extending this equation by going backwards in time though the recursive law in (39) yields

2k −1 ln

(ℓ) ̟k (ℓ) ̟1

!

    (O) k k     Ck,i X X −1 −1 1 (ℓ) (ℓ)T (O) (O)T (ℓ) (O)  + tr  1  (42) − tr  ǫk,i ǫk,i Ck,i ǫk,i ǫk,i Ck,i = ln (ℓ) k k Ck,i j=1 j=1

Rearranging (42) yields the following form: (ℓ)

2k −1 ln

̟k

(ℓ)

̟k−1

!

   (O)  k i h  −1 Ck,i X −1 1 (O) (ℓ) (ℓ) (ℓ)T (O) (ℓ)   − tr Ck,i + tr(I) − tr  Ck,i (ǫk,i ǫk,i − Ck,i ) Ck,i = ln  (ℓ) k Ck,i j=1 (43) (ℓ)

where I is the identity matrix of the size of Ck,i . The first three terms in (39) are non-positive values k X (O) (O)T (O) 1 ǫk,i ǫk,i → Ck,i as k → ∞. Let us rewrite (43) in based on (37). The ergodicity principle gives k j=1

shorthanded form as

(ℓ)

2k

−1

ln

̟k

(ℓ) ̟1

!

= −α(ℓ) − β (ℓ)

(44)

where  (O)  h i Ck,i (O) (ℓ) −1  + tr Ck,i C − tr(I) α(ℓ) = − ln  k,i (ℓ) Ck,i   k  −1 X 1 (ℓ) (ℓ)T (O) (ℓ)  (ǫ ǫ − Ck,i ) Ck,i β (ℓ) = tr  k j=1 k,i k,i

(45)

The terms α(ℓ) are non-negative if p(ℓ) 6= ptrue . This leads to (ℓ)

̟k

(ℓ) ̟1

1

(ℓ)

= e− 2 (α

+β (ℓ) )k

Note, α(O) and β (O) are strictly zero for the optimal case as

1 k

k X

(46)

(ℓ) (ℓ)T

(ℓ)

(O)

ǫk,i ǫk,i → Ck,i → Ck,i , which leads

j=1

(ℓ)

to

̟k

(ℓ)

̟0

→ 1 as k → ∞.

It is worth investigating a special case of the GMMAE approach where the correlation between residuals is not incorporated into the likelihood. In the uncorrelated case the autocorrelation matrix is a block

12

(ℓ)

diagonal matrix and Ck, i = 0 for i 6= 0. For this special case, α and β can be reduced to  (O)  i h Ck,i (O) (ℓ) −1 (ℓ)  + tr Ck,i − tr(I) C αU = − ln  k,i U (ℓ) Ck,i U    (O)  k h i C X −1 s, 0 (O) (ℓ) − ln   = − tr(I) (ℓ) + tr Cs, 0 Cs, 0 Cs, 0 s=k−i    (O)  h i Ck, 0 −1 (O) (ℓ) (ℓ)  + tr Ck, − tr(I) = (i + 1)α0 ≡ (i + 1) − ln  0 Ck, 0 (ℓ) Ck, 0 (ℓ)

βU

 j k  −1 X X 1 (O) (ℓ)  (e(ℓ) e(ℓ)T − Cs, 0 ) Ck, 0 = tr  k j=1 s=j−i s s   k −1  X 1 (ℓ) (O) (ℓ)T (ℓ)  = (i + 1)β0(ℓ) ≡ (i + 1)tr  − Cj, 0 ) Cj, 0 (e e k j=1 j j

(47)



(48)

and the convergence rate will be

(ℓ)

̟k

1

(ℓ)

= e− 2 (α0

(ℓ) ̟1

(ℓ)

+β0 )(i+1)k

(49)

Setting i = 0 compares this result with the convergence rate of the traditional MMAE approach: (ℓ)

̟k

(ℓ) ̟1

1

(ℓ)

= e− 2 (α0

(ℓ)

+β0 )k

(50)

This shows that even without considering the autocorrelation in GMMAE approach, it converges faster than the traditional MMAE. Incorporating the autocorrelation in the GMMAE makes the convergence to the optimal parameter faster. This shown by comparing the convergence of GMMAE to the uncorrelated GMMAE as follows:  h i !  ! −1/2 −1 (ℓ) 1 (ℓ)T (ℓ) (ℓ) |C | exp − ǫ (C ) ǫ k,i k,i k,i 2 k,i ̟k ̟k iG h (51) =  (ℓ) (ℓ) (ℓ)T −1 (ℓ) −1/2 1 ̟k−1 G ̟k−1 U |Ck,i | exp − 2 ǫk,i (Ck,i ) ǫk,i U

where the subscripts U and G refer to the uncorrelated GMMAE and the standard form of the GMMAE, respectively. Taking the natural logarithm of both sides in (51) gives ! ! (ℓ) (ℓ) (ℓ) i h Ck,i ̟k ̟k (ℓ) (ℓ)T −1 G 2 ln − tr ǫ ǫ (C ) − 2 ln = − ln k,i k,i k,i (ℓ) (ℓ) |Ck,i |U G (52) ̟k−1 G ̟k−1 U i h (ℓ) (ℓ)T −1 + tr ǫk,i ǫk,i (Ck,i ) U

13

(ℓ) (ℓ)T

If p(ℓ) is the true parameter ǫk,i ǫk,i → (Ck,i )G , where the off-diagonal terms are the predicted estimates C¯k, i , and thus ! ! (ℓ) (ℓ) h i |Ck,i |G ̟k ̟k −1 − trI + tr (C ) (C ) − 2 ln = − ln 2 ln (53) k,i k,i G U (ℓ) (ℓ) |Ck,i |U ̟k−1 G ̟k−1 U (ℓ)

The right side of (53) yields a positive quantity which means the weight ̟k is higher in the GMMAE if p(ℓ) is true parameter. This shows that the GMMAE approach converges faster toward the optimal estimate than the uncorrelated GMMAE. The autocorrelation helps the generalized adaptive law to makeup for the decline in the convergence rate in the previous steps due to the residuals randomness. However, if p(ℓ) is not the true parameter then the weight convergence is given from (43):  (O)  ! (ℓ) i h Ck,i ̟ (O) (ℓ) −1 k   − tr Ck,i + tr(I) 2k −1 ln C = ln k,i G (ℓ) (ℓ) ̟1 Ck,i G (54)   k   X −1 1 (ℓ) (ℓ)T (O) (ℓ)  − tr  (ǫ ǫ − Ck,i ) Ck,i k j=1 k,i k,i G Let [ǫ¯i ǫ¯Ti ](ℓ) =

1 k

k X

(ℓ) (ℓ)T

ǫk,i ǫk,i and rewrite (54) as

j=1

2k −1 ln

(ℓ)

(ℓ) ̟k (ℓ) ̟1

(ℓ)

!

 (O)  h i Ck,i (O) (ℓ) −1  − tr Ck,i C = ln  k,i G (ℓ) Ck,i G i h (O) (ℓ) −1 + tr(I) − tr ([ǫ¯i ǫ¯Ti ](ℓ) − Ck,i )Ck,i G

(55)

(ℓ)

Let [ǫ¯i ǫ¯Ti ](ℓ) = [ǫ¯i ǫ¯Ti ]d + [ǫ¯i ǫ¯Ti ]f where [ǫ¯i ǫ¯Ti ]d contains the diagonal elements of matrix [ǫ¯i ǫ¯Ti ](ℓ) (ℓ) and [ǫ¯i ǫ¯Ti ]f contains the off-diagonal elements, so that  (O)  ! (ℓ) i h Ck,i ̟k (O) (ℓ) −1  − tr Ck,i = ln  2k −1 ln + tr(I) C k,i (ℓ) G (ℓ) ̟1 Ck,i (56) G i i h h (ℓ) (ℓ) −1 (ℓ) (O) (ℓ) −1 − tr ([ǫ¯i ǫ¯Ti ]d − Ck,i )Ck,i G − tr ([ǫ¯i ǫ¯Ti ]f )Ck,i G  As stated earlier the estimate of the off-diagonal elements E ek eTk−i = C¯k, i is given in (29), assuming k is large enough [7]. Then, since the estimate C¯k, i = −Cˆk, i as given in (36), we have (ℓ)

(ℓ)

(ℓ)

[ǫ¯i ǫ¯Ti ]f = −Ck,i G + Ck,i U

Substituting (57) into (56) gives  (O)  ! (ℓ) i h Ck,i ̟ (O) (ℓ) −1 k   − tr Ck,i + tr(I) C = ln 2k −1 ln k,i G (ℓ) (ℓ) ̟1 Ck,i G i i h h (ℓ) (ℓ) (ℓ) −1 (ℓ) (O) (ℓ) −1 − tr ([ǫ¯i ǫ¯Ti ]d − Ck,i )Ck,i G + tr (Cˆk,iG − Cˆk,iU )Ck,i G

(57)

(58)

14

Now, (58), after algebraic rearrangement, becomes  (O)  ! (ℓ) i h Ck,i ̟k (O) (ℓ) −1   − tr Ck,i + tr(I) C = ln 2k −1 ln k,i G (ℓ) (ℓ) ̟1 Ck,i U i h (O) (ℓ) −1 T (ℓ) − tr ([ǫ¯i ǫ¯i ]d − Ck,i )Ck,i G  (ℓ)  h i Ck,i (ℓ) (ℓ) −1 G  + tr(I) − tr Ck,i − ln  C U k,i G (ℓ) Ck,i

(59)

U

which reduces down to

 (O)  i h Ck,i (ℓ) −1   + tr(I) − tr ([ǫ¯i ǫ¯Ti ](ℓ) )C = ln 2k −1 ln d k,i G (ℓ) (ℓ) ̟1 Ck,i G (60)  (ℓ)  h i Ck,i (ℓ) (ℓ) −1 G  + tr(I) − tr Ck,i − ln  C U k,i G (ℓ) Ck,i U i h i h −1 (ℓ) (ℓ) (ℓ) −1 (O) (O) (ℓ) −1 By adding the following terms −tr [ǫ¯i ǫ¯Ti ]d (Ck,i U − Ck,i U ) = 0 and −tr (Ck,i − Ck,i )Ck,i U = 0 to (60) and rearranging, we obtain  (O)  ! (ℓ) h i Ck,i ̟k (O) (ℓ) −1   − tr Ck,i = ln 2k −1 ln C + tr(I) k,i U (ℓ) (ℓ) ̟1 Ck,i U i h (O) (ℓ) −1 T (ℓ) − tr ([ǫ¯i ǫ¯i ]d − Ck,i )Ck,i U (61) i h (ℓ) −1 (ℓ) (ℓ) −1 − tr [ǫ¯i ǫ¯Ti ]d (Ck,i G − Ck,i U )  (ℓ)  i h Ck,i (ℓ) (ℓ) −1 U  − tr Ck,i + tr(I) C + ln  k,i G U (ℓ) Ck,i (ℓ)

̟k

!

G

Writing (61) in shorthand form gives

(ℓ)

2k

−1

ln

̟k

(ℓ)

̟1

!

= −[αU + βU + δGU + αGU ]

(62)

with h i |Ck,i |U −1 αGU = − ln − trI + tr (Ck,i )U (Ck,i )G |Ck,i |G i h (ℓ) (ℓ) −1 (ℓ) −1 δGU =tr [ǫ¯i ǫ¯Ti ]d (Ck,i G − Ck,i U )

(63a) (63b)

where αGU has non-negative value and δGU is a trace of a product of matrix differences. Since the (ℓ) term [ǫ¯i ǫ¯Ti ]d is a diagonal matrix, the trace carries the diagonal elements of the second term only. At (ℓ) convergence the autocorrelation matrix Ck,i G is positive definite and nearly a diagonal symmetric matrix

15

where (C −1 )i,i ≥ (Ci,i )−1 [16]. This shows δGU is a positive quantity. Then, the convergence rate is given as (ℓ)

̟k

(ℓ) ̟1

1

(ℓ)

=e− 2 [αU =e

(ℓ)

+βU +δGU +αGU ]k

(64)

(ℓ) (ℓ) − 21 [(α0 +β0 )(i+1)+δGU +αGU ]k

Incorporating the autocorrelation in the GMMAE approach shows faster convergence than the uncorrelated GMMAE and even faster than the traditional MMAE approach. This proves the convergence properties of the GMMAE approach using the Kalman filter as an elemental filter where a linear model is used with stationary measurement and/or process noise. Similarly, the convergence rate for the linear time-varying model can be obtained, by recalling (41), as   ! (O) 1/2  (ℓ)  −1  C ̟k  k,i  1 (ℓ) (ℓ)T (ℓ) = ln − tr ǫ ǫ C ln   k,i k,i k,i (ℓ) 2 (ℓ) 1/2 ̟k−1 (65) Ck,i     −1 1 (O) (O)T (O) Ck,i + tr ǫk,i ǫk,i 2 Rearranging (65) yields

2 ln

(ℓ) ̟k (ℓ) ̟k−1

!

which can be written in short hand as

 (O)  i h Ck,i (O) (ℓ) −1  − tr Ck,i + tr(I) C = ln  k,i (ℓ) Ck,i   −1  (ℓ) (ℓ)T (O) (ℓ) − tr (ǫk,i ǫk,i − Ck,i ) Ck,i   −1  (O) (O)T (O) (O) + tr (ǫk,i ǫk,i − Ck,i ) Ck,i (ℓ)

2 ln

̟k

(ℓ)

̟k−1

(ℓ)

(ℓ)

= −αk − βk

(66)

(67)

where (ℓ)

αk

(ℓ)

βk

 (O)  i h Ck,i (O) (ℓ) −1  + tr Ck,i − tr(I) C = − ln  k,i (ℓ) Ck,i    −1   −1  (ℓ) (ℓ)T (O) (ℓ) (O) (O)T (O) (O) = −tr (ǫk,i ǫk,i − Ck,i ) Ck,i + tr (ǫk,i ǫk,i − Ck,i ) Ck,i

(68)

(ℓ)

Then, to obtain

̟k

(ℓ)

̟1

, let α(ℓ) =

k k 1 X (ℓ) 1 X (ℓ) αj , β (ℓ) = β k j=1 k j=1 j

(69)

16

Then, the overall convergence rate is given as (ℓ)

̟k

(ℓ) ̟1

1

(ℓ)

= e− 2 (α

+β (ℓ) )k

(70)

The α term is always non-negative from (37). The non-negativeness of β still holds for linear time-variant (ℓ) (ℓ)T systems even if the βk terms are from different times. This is because the residual error (ǫk,i ǫk,i − (O)



−1

(ℓ) , which follows the normalized Kalman innovations of the Kalman filter. Ck,i ) is normalized by Ck,i The same approach is valid for the nonlinear case where the first-order Taylor series approximates the system functions. This is limited when the linearized Φk (ˆ xk ) and Hk (ˆ xk ) are bounded and provide close approximations to the real system. The extended Kalman filter (EKF) is derived with this assumption and the approach is assumed valid as long as the EKF is valid for the problem. Simulations using the EKF indicate that the non-negativeness property of β term is true however. Figure 3 shows the value 1 (ℓ) of e− 2 β for different p(ℓ) . The simulation is performed for the system in next section except p(ℓ) is linearly distributed equally in the range [0, 100] where the true parameter ptrue = 10. This shows that the GMMAE approach provides superior performance and that the correlated terms do play a role in the convergence.

1 0.9 0.8

0.6

1

e− 2 β

(ℓ)

0.7

0.5 0.4 0.3 0.2

GMMAE MMAE Uncorrelated

0.1 0

0

20

40

60

80

100

p(ℓ)

Fig. 3.

Convergence of Linearly Distributed p(ℓ)

V. N UMERICAL S IMULATIONS Two practical examples are presented in this section to compare the performance of the MMAE and GMMAE approaches for noise covariance adaptation. A. Target Tracking The state vector is chosen as the position and velocity in x and y directions, given by x = [x x˙ y y] ˙ T. The target motion model is given by a linear dynamical system: xk+1 = Φxk + wk

(71)

17

with

 1 0  Φ= 0 0

∆t 1 0 0

 0 0 0 0  1 ∆t 0 1

(72)

The sampling period ∆t = 0.01. The 4 × 4 process noise covariance matrix is parameterized by qx and qy , given by    3  ∆t /3 ∆t2 /2 02×2  qx ∆t2 /2 ∆t  3  (73) Qk =  2   ∆t /3 ∆t /2 02×2 qy 2 ∆t /2 ∆t

The true values of qx and qy are chosen as qx = qy = 10. In the MMAE and GMMAE approaches, the elements of qx and qy for the individual Kalman filters are generated using a two-dimensional Hammersley sequence under the assumption that qx and qy are independently uniformly distributed in [0, 100]. The number of elements in both the MMAE and the GMMAE approach is 250. The measurement model is given by   p x2 + y 2 ˜k = + vk (74) y arctan(y/x)

It is assumed that the noise terms in range and azimuth measurements are uncorrelated, so the measurement noise covariance matrix is diagonal, given by Rk = diag[rρ rA ]. We choose rρ = 0.01 and rA = 0.000001 in the simulations. Since the measurement model is nonlinear in the state vector, EKFs are used in the MMAE and GMMAE approaches with the sensitivity matrix Hk evaluated at the current estimate. The EKFs are provided with good initial estimates of the state vector, so they do not suffer any divergence problems. An alternative approach to working with the nonlinear measurement model is to first convert the original range and azimuth measurements to the effective position measurements on x and y and then apply the linear Kalman measurement update. In the latter approach, the covariance matrix for the effective measurement does not take as simple a form as the shown Rk and becomes data-dependent. The resulting residual sequence is not stationary or ergodic in either case. There are many possibilities for the chosen distribution of the process noise covariance parameters. A simple approach is to assume a uniform distribution simulated by pseudo-random numbers. We instead choose a Hammersley quasirandom sequence [17] due to its well distributed pattern. A comparison between the uniform distribution and the Hammersley quasi-random sequence for 500 elements is shown in Figure 4. The Hammersley quasi-random sequence provides a better “spread” of elements than the uniform distribution. The two adaptive estimators are run 50 seconds to process the same measurement data. As discussed in the previous section, the GMMAE approach goes back i time steps in order to form the residual. (ℓ) We choose i = 4 for the GMMAE approach. The size of the corresponding residual ǫk,i is 10 by 1. The results are given in Figures 5(a) and 5(b). It can be seen that both estimators converge within 50 seconds, but the MMAE approach takes more than twice as much time as the GMMAE approach to converge. Both estimators converge with a relative small set of elements in the above-mentioned example. A closer examination of the automatically-generated element set shows that of the 250 elements there is a element that is close to the true values of qx and qy . If all of the 250 elements are far away from the true values, then the MMAE and GMMAE approaches with 250 static elements may fail to yield satisfactory performance, since the estimates of qx and qy are given by the average of the elements. Increasing the element size is always a solution, but as the dimensionality of the problem increases, the computational complexity involved may quickly become prohibitive. Meanwhile, it should be noted that

18

100

80

80

60

60

p2

p2

(ℓ)

(ℓ)

100

40

40

20

20

0

Fig. 4.

0

20

40

(ℓ)

60

80

0

100

0

20

40

(ℓ)

60

80

100

p1

p1

(a) Uniform Distribution

(b) Hammersley Quasi-Random Sequence

Uniform Distribution and Hammersley Quasi-Random Sequence Comparison

most of the elements have almost zero weights at the end of the simulation and their contributions to the final estimates of qx and qy are negligible. 60

80 GMMAE MMAE

GMMAE MMAE 70

50 60 40

qy

qx

50

30

40

30 20 20 10 10

0

0

5

10

15

20

25

30

35

40

Time (Sec)

(a) Estimates of qx Fig. 5.

45

50

0

0

5

10

15

20

25

30

35

40

45

50

Time (Sec)

(b) Estimates of qy

Simulation of MMAE and GMMAE Estimates

B. Single-Axis Attitude Estimation The single axis attitude estimation problem is concerned with using angle-attitude measurements and rate information from gyros to estimate the attitude error and gyro drift. Two error sources are generally present in gyros. The first is a short-term component of instability, referred to as random drift, and the second is a random walk. These noise terms affect the Kalman filter estimates if they are not well represented in the model. The MMAE and GMMAE are used to estimate the statistical properties of these

19

noise terms. The attitude rate θ˙ is assumed to be related to the gyro output ω ˜ by θ˙ = ω ˜ − β − ηv

(75)

where β is the gyro drift rate and ηv is a zero-mean Gaussian white-noise process with spectral density given by σv2 . The drift rate is modeled by a random walk process, given by β˙ = ηu

(76)

where ηu is a zero-mean Gaussian white-noise process with spectral density given by σv2 . The parameters σv2 and σu2 are estimated by the GMMAE and MMAE approaches. The linear model used in the Kalman filter follows the attitude angle measurement model” y˜k = θk + υk

(77)

where υk a zero-mean Gaussian white-noise process with variance given by R = σn2 . The discrete-time system used in the Kalman filter can now be written as xk+1 = Φxk + Γ˜ ω + wk

(78)

y˜k = Hxk + υk

(79)

where xk = [θ, β]Tk , Γ = [∆t, 0], H = [1, 0], 

1 Φ= 0

 −∆t 1

and Q = E{wk wkT } =



σv2 ∆t + 31 σu2 ∆t3 − 21 σu2 ∆t2

− 21 σu2 ∆t2 σu2 ∆t



where ∆t is the constant sampling interval. Synthetic measurements are created using a true constant angle-rate given by θ˙ = 0.0001 rad/sec and a sampling period of 1 second. The noise parameters are given by σn = 17 × 10−6 rad, σu = 1 × 10−10rad/sec3/2 and σu = 1 × 10−7rad/sec1/2 . The initial bias β0 is given as 0.1 deg/hr. A set of only eleven filters run in parallel with the initial covariance matrix set to P0 = diag[1 × 10−4 , 1 × 10−12 ] for each. The only difference in the filters is the predetermined values for σu and σv , which are randomly selected in a neighborhood of the true values. As in the target tracking example, both MMAE and GMMAE are used to estimated the unknown parameters σu and σv . Figure 6 shows the performance of both approaches in estimating the attitude error and the gyro drift rate. Figures 7 and 8 show the estimates using the MMAE and GMMAE approaches, respectively, for 10 Monte Carlo runs. The results show that the GMMAE approach, using only four steps back, outperforms the MMAE approach by converging in about one-third of the time and with much smaller standard deviation between runs. A study on choosing different steps back in the GMMAE approach has also been done. Figure 9 shows the convergence rate for the MMAE approach and GMMAE approach with i = 2, 4, 6. Note that even though the theoretical derivations show that the convergence rate increases with the number of steps back taken in the GMMAE likelihood, a practical limit exists for this number. This is due to the limited number of samples chosen in the parameter space and results can vary from application to application.

0.01

0.12

0

0.1

−0.01 −0.02 −0.03 MMAE

−0.04 −0.05

GMMAE

0

10

20

30

40

50

Bias Estimate (deg/hr)

Attitude Error (deg)

20

60

0.08 0.06 0.04 0.02 GMMAE 0 −0.02

MMAE 0

10

20

Time (Min) (a) Estimates Attitude Error Fig. 6.

40

50

60

50

60

(b) Estimates of Gyro Drift Rate

Simulation of MMAE vs. GMMAE Gyro State Estimates −10

−7

x 10

x 10 1.4

1

1.2

σu (rad/sec3/2 )

σv (rad/sec1/2 )

1.2

0.8

1

0.8

0.6

0.6

0.4

0.4

0.2 0

0.2

0

10

20

30

40

50

60

0

0

Time (Min)

10

20

30

40

Time (Min)

(a) Estimates of Process noise σu Fig. 7.

30

Time (Min)

(b) Estimates of Process noise σv

Simulation of MMAE Gyro Estimates

VI. C ONCLUSION In this paper a generalized approach for multiple-model adaptive estimation was shown, which can be used for time-varying systems. The main advantage of this approach is that it provides faster convergence properties than the traditional multiple-model adaptive estimation approach. The new approach is based on using the autocorrelation of the measurement-minus-estimate residual. A formulation was developed for the Kalman filter, however computation of the off-diagonal elements of the autocorrelation matrix is intractable in practice since these matrix elements are a function of the true parameters. This difficulty was overcome by using the estimated values in the Kalman gain. It is proven that the generalized approach provides better convergence properties than the standard approach because the autocorrelation incorporated into generalized approach’s likelihood raises the sensitivity toward the parameter of an optimal filter. Simulation results indicated that the new multiple-model adaptive estimation approach can provide better convergence to the best estimate over the standard approach. Similar results have been achieved when the approach is applied to the tracking problem using range and angle measurements, which involves the use

21

−10

−7

x 10

x 10 1.4

1

1.2

σv (rad/sec1/2 )

σu (rad/sec3/2 )

1.2

0.8

1

0.8

0.6

0.6

0.4

0.4

0.2 0

0.2

0

10

20

30

40

50

0

60

0

10

Time (Min)

30

40

50

60

Time (Min)

(a) Estimates of Process Noise σu Fig. 8.

20

(b) Estimates of Process Noise σv

Simulation of GMMAE Gyro Estimates −7

x 10 1.4

1.2

σv (rad/sec1/2 )

1

0.8

0.6

0.4 0 MMAE 2 GMMAE 4 GMMAE

0.2

6 GMMAE 0

Fig. 9.

0

10

20

30

Time (Min)

40

50

60

Simulation of MMAE and Varying Steps Back GMMAE of Process Noise σv Estimates

of the extended Kalman filter for state estimation. For nonlinear model and/or measurement processes, it is assumed that the linearized output sufficiently captures the statistics of the likelihood function by making the small noise assumption, and thus the time-varying solution provided in this paper is applicable for this case. R EFERENCES [1] X. R. Li and V. P. Jilkov, “Survey of maneuvering target tracking. part v. multiple-model methods,” IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, pp. 1255–1321, Oct. 2005. [2] D. T. Magill, “Optimal adaptive estimation of sampled stochastic processes,” IEEE Transactions on Automatic Control, vol. 10, no. 4, pp. 434–439, Oct. 1965. [3] H. A. P. Blom and Y. Bar-Shalom, “The interacting multiple model algorithm for systems with markovian switching coefficients,” IEEE Transactions on Automatic Control, vol. 3, no. 8, pp. 780–783, Aug. 1988. [4] X. R. Li and Y. Bar-Shalom, “Multiple-model estimation with variable structure,” IEEE Transactions on Automatic Control, vol. 41, no. 4, pp. 478–493, April 1996.

22

[5] J. R. Vasquez and P. S. Maybeck, “Enhanced motion and sizing of bank in moving-bank mmae,” IEEE Transactions on Aerospace and Electronic Systems, vol. 40, no. 3, pp. 770–779, July 2004. [6] B. J. Odelson, M. R. Rajamani, and J. B. Rawlings, “A new autocovariance least-squares method for estimating noise covariances,” Texas-Wisconsin Modeling and Control Consortium, Tech. Rep. 2003-04, Sept. 2003. [7] R. K. Mehra, “On the identification of variances and adaptive kalman filtering,” IEEE Transactions on Automatic Control, vol. 15, no. 2, pp. 175–184, April 1970. [8] P. D. Hanlon and P. S. Maybeck, “Multiple-model adaptive estimation using a residual correlation kalman filter bank,” IEEE Transactions on Aerospace and Electronic Systems, vol. AES-36, no. 2, pp. 393–406, April 2000. [9] R. G. Brown and P. Y. C. Hwang, Introduction to Random Signals and Applied Kalman Filtering, 3rd ed. New York, NY: John Wiley & Sons, 1997, pp. 353–361. [10] R. F. Stengel, Optimal Control and Estimation. New York, NY: Dover Publications, 1994, pp. 402–407. [11] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Mineola, NY: Dover Publications, 2005, ch. 10.1. [12] B. D. O. Anderson, J. B. Moore, and R. M. Hawkes, “Model approximations via prediction error identification,” Automatica, vol. 14, no. 6, pp. 615–622, Nov. 1978. [13] J. L. Crassidis and Y. Cheng, “Generalized multiple-model adaptive estimation using an autocorrelation approach,” in Proceedings of the 9th International Conference on Information Fusion, Florence, Italy, July 2006, paper 223. [14] P. D. Hanlon, “Practical implementation of multiple model adpative estimation using neyman-pearson based hypothesis testing and spectral estimation tools,” Ph.D. dissertation, School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH, September 1996. [15] B. Alsuwaidan, “Generalized multiple-model adaptive estimation,” Ph.D. dissertation, University at Buffalo, The State University of New York, Amherst, NY, 2008. [16] P. Sprent, “Inversion of nearly diagonal matrices,” The Mathematical Gazette, pp. 184–189, May 1965. [17] J. M. Hammersley, “Monte carlo methods for solving multivariable problems,” Annals of the New York Academy of Sciences, vol. 86, pp. 844–874, 1960.