
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 48, NO. 6, JUNE 2000

System Parameter Estimation with Input/Output Noisy Data and Missing Measurements

Jeng-Ming Chen and Bor-Sen Chen, Senior Member, IEEE

Abstract—In this paper, an investigation is undertaken to examine the parameter estimation problem of linear systems when some of the measurements are unavailable (i.e., missing data) and the probability of occurrence of missing data is unknown a priori. The system input and output data are also assumed to be corrupted by measurement noise, and the noise distribution is unknown. Under unknown noise distribution and missing measurements, a consistent ℓp-norm iterative estimation algorithm, based on the iteratively reweighted least squares (IRLS) algorithm, is proposed to estimate the system parameters. We show that if the probability of a missing measurement is less than one half, the parameter estimates obtained via the proposed estimation algorithm converge to the true parameters as the number of data tends to infinity. Finally, several simulation results are presented to illustrate the performance of the proposed ℓp-norm iterative estimation algorithm. The simulation results indicate that, under input/output missing data and noise, the proposed parameter estimation algorithm is an efficient approach to the system parameter estimation problem.

Index Terms—Input/output noisy data, ℓp-norm iterative algorithm, missing measurement, parameter estimation.

I. INTRODUCTION

ESTIMATION of the system parameters, given noisy input/output data, is a major field in control and signal processing. Many different estimation methods have been proposed in recent years; see, e.g., [7], [10], and [20]. In most cases, however, it is assumed that the measurements always contain the signal. In practical situations, there may be a nonzero probability that any observation consists of noise alone, i.e., the measurements are not consecutive but contain missing observations. Missing observations are caused by a variety of reasons, e.g., a failure in the measurement, intermittent sensor failures, accidental loss of some collected data, or data that are jammed or come from a high-noise environment [1]. Fundamentally, the standard definition of covariance in the statistical analysis of data does not directly apply if some of the measurements are unavailable [1], [2]. Thus, many currently used parameter estimation algorithms do not apply to this situation.

Manuscript received December 15, 1998; revised November 8, 1999. This work was supported by the National Science Council of Taiwan under Contract NSC 87-2218-E-007-032. The associate editor coordinating the review of this paper and approving it for publication was Prof. José R. Casar. J.-M. Chen is with the Department of Electrical Engineering, St. John's and St. Mary's Institute of Technology, Tamsui, Taiwan, R.O.C. (e-mail: [email protected]). B.-S. Chen is with the Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. (e-mail: [email protected]). Publisher Item Identifier S 1053-587X(00)04058-7.

This paper is concerned

with the problem of estimating the system parameters when the system input/output data are corrupted by measurement noise and some input/output data are missing. The parameter estimation problem for measurements with missing values is more difficult than the case with consecutive measurements. Only a limited number of estimation methods for system output signals with missing measurements have been developed; see, e.g., [1], [2], and [17]. Some early works on the parameter estimation problem under missing observations are discussed in [1] and [2]. However, these studies are based on the assumption that the observed system input/output data are not all corrupted by measurement noise. As is well known, in practical industrial situations, the observed input/output data of an identified system are typically corrupted by measurement noise, and this phenomenon has a great influence on the accuracy of the estimated system parameters.

A number of estimation methods have been proposed for the system parameter estimation problem, e.g., the maximum likelihood estimation (MLE) method, the least squares (LS) method, and the prediction error method (PEM) [14]. Among them, maximum likelihood methods frequently achieve the best possible asymptotic efficiency, i.e., their performance relative to the Cramér–Rao lower bound is asymptotically optimal. Nevertheless, maximum likelihood methods are also known to be relatively nonrobust [6]. Deviation of the data from the ideal model, e.g., measurement noise with unknown distribution, missing data, outliers, and temporary deviations from stationarity, often leads to a local maximum that may be far from the "true" global maximum. Since the distribution of the measurement noise in this paper is assumed to be unknown and some measurements of the system input/output data are missing, developing an alternative estimation method that is robust against both an unknown noise distribution and missing data becomes necessary.

It is well known that a least sum of absolute deviations criterion (ℓ1-norm minimization) may be useful in generating a robust parametric estimation algorithm when the data are contaminated by impulsive noise [4]. However, ℓ1-norm minimization may result in an infinity of solutions, i.e., the solution is not always unique [11]. The robustness of an ℓ1-norm solution provides an alternative criterion for system parameter estimation when the residual distribution is non-Gaussian. In addition, a robust parameter estimation method for ARMA systems with output noise via the ℓp-norm approach has been discussed in [18]. In general, the ℓp-norm solution may be obtained in a variety of ways. Moreover, it has been shown that ℓp-norm estimators are the maximum likelihood estimators when the probability density function of the residual is the generalized Gaussian [16]. Thus, if the distribution of the residual is unknown, we can find a value of p


for the ℓp-norm estimator to approach a corresponding maximum likelihood estimate; in addition, its results have been demonstrated to be better than those of other linear parameter estimation methods [8]. Since the measurements corresponding to some time instants are missing, the residual (prediction error) is composed of noise and missing data. In such cases, the true distribution of the residual is actually unknown, and conventional estimation algorithms are not adequate for estimating the system parameters. Therefore, the ℓp-norm estimation approach is adopted in this paper. Although several methods may be used to generate ℓp-norm solutions, only the iteratively reweighted least squares (IRLS) algorithm is considered here. The main reason for using the IRLS algorithm to compute ℓp-norm solutions is that it is easy to implement and its convergence is guaranteed. In this paper, a consistent parameter estimation algorithm is proposed on the basis of the IRLS estimation algorithm for the parameter estimation of a system with unknown noise distribution and missing measurements. Moreover, a rule for selecting the most appropriate value of p for any given residual distribution is introduced. Finally, we also show that if the probability of occurrence of a missing measurement is less than one half, the parameter estimates obtained via the proposed estimation algorithm converge to the true parameters as the number of data tends to infinity.

The paper is organized as follows. Some basic assumptions and a detailed statement of the problem are given in Section II. An ℓp-norm estimation method, i.e., the IRLS algorithm, is discussed in Section III; how to select an adequate exponent p from the sample kurtosis is also introduced there. The parameter estimation algorithm is established in Sections IV and V. The convergence analysis of the proposed method is performed in Section VI. Section VII presents some numerical illustrations. Concluding remarks are offered in Section VIII.

II. STATEMENT OF THE PROBLEM

Let the single-input single-output discrete-time system be described by the following model:

(1)

where A(q^{-1}) = 1 + a_1 q^{-1} + ... + a_n q^{-n}, B(q^{-1}) = b_1 q^{-1} + ... + b_m q^{-m}, and q^{-1} is the backward shift operator, i.e., q^{-1} y(t) = y(t - 1). Without loss of generality, A(q^{-1}) is taken to be monic, as written above. The sequences u(t) and y(t) are the system input and output signals, respectively. We assume that at random instants, the signals u(t) and y(t) may be absent from the measurements. Let the modulated sequences be defined as

(2)

where the modulating variables are binary random variables that equal one if the corresponding sample of y(t) or u(t) is measured and zero if it is missing:

(3)


Thus, the sequences defined in (2) can be regarded as the measurements of y(t) and u(t) with missing data, respectively. Moreover, they are also assumed to be corrupted by zero-mean measurement noise. Let the resulting sequences denote the observed system input and output signals, respectively. Thus, the measurement equations are given by

(4)

Next, the vector of the unknown true parameters in the system model (1) is defined as

(5)

As a result, (1) can be written as

(6)

where the regression vector is

(7)

The following assumptions are made.

A1) The input and output measurement noises are zero-mean white noise sequences with the same probability distribution, differing only in their variances; here E{·} denotes the expectation operator. Furthermore, they are mutually independent.

A2) The noise sequences are independent of the signals u(t) and y(t).

A3) The driving input sequence u(t) is a stationary ergodic random sequence.

A4) The orders n and m of the identified system are known, and all the zeros of the polynomial A(q^{-1}) lie strictly inside the unit circle.

A5) The modulating sequences in (3) are assumed to be asymptotically stationary and independent of u(t) and y(t). Furthermore, they are mutually independent.

The probability for any measurement of u(t) or y(t) to be observed is assumed to be unknown. Moreover, the system input and output have the same probability of missing measurements, i.e.,

(8)

where the observation probability is fixed and independent of time. Thus, the probability of missing measurements (input or output) is one minus this fixed probability. The problem to be considered involves estimating the system parameter vector θ from the available observed data.
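A small Python sketch of the kind of data-generating and observation setup described above: an assumed ARX-type difference equation (the paper's exact display (1) may differ in detail), zero-mean white measurement noise on both channels as in A1, and Bernoulli observation indicators as in (2)-(3), with a missing sample recorded as noise alone (one common convention, consistent with the earlier remark that an observation may consist of noise alone). All function and variable names, and all numerical values, are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def simulate_observations(N, a, b, sigma_u, sigma_y, p_obs, rng):
    """Illustrative simulation of noisy, partially missing input/output data.

    a, b   : coefficients of an assumed difference equation
             y(t) = -a1*y(t-1) - ... + b1*u(t-1) + ...
    p_obs  : probability that a sample is observed (unknown to the estimator
             in the paper; identical for input and output by assumption).
    """
    u = rng.standard_normal(N)                       # stationary ergodic driving input (A3)
    y = np.zeros(N)
    for t in range(N):
        y[t] = (-sum(ai * y[t - 1 - i] for i, ai in enumerate(a) if t - 1 - i >= 0)
                + sum(bi * u[t - 1 - i] for i, bi in enumerate(b) if t - 1 - i >= 0))
    gam_y = (rng.random(N) < p_obs).astype(float)    # 1 = measured, 0 = missing
    gam_u = (rng.random(N) < p_obs).astype(float)
    z = gam_y * y + sigma_y * rng.standard_normal(N)  # observed output: signal (if present) + noise
    x = gam_u * u + sigma_u * rng.standard_normal(N)  # observed input
    return x, z, gam_u, gam_y

rng = np.random.default_rng(0)
x_obs, z_obs, gu, gy = simulate_observations(
    N=500, a=[-1.5, 0.7], b=[1.0, 0.5], sigma_u=0.1, sigma_y=0.1, p_obs=0.9, rng=rng)
```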


Remark 1: The pattern of missing data can generally be quite arbitrary. However, the following two patterns are of special interest [1], and they will be considered in the simulation examples: 1) the random Bernoulli pattern, in which each measurement has a fixed probability of being missed and, for different instants, the occurrences of missing data are mutually independent; and 2) the deterministic periodic pattern, in which the pattern of missing data in a set of time points is repeated periodically.

Now, by (6), we have

(9)

Define the quantities relating the observed data to the true signals. Thus, by the assumptions and (4), we have

(10)

(11)

(12)

Next, the vector of observed variables is defined as

(13)

Thus, by (10)-(12), we have

(14)

where the matrix on the right-hand side is diagonal and is built from identity matrices of orders n and m, respectively. From (9) and (14), we have

(15)

Therefore, if the matrix in (14) is known and the least squares estimate of θ is given by

(16)

then the estimate in (16) is a consistent estimate of the true parameter vector θ. In general, the least squares estimator is widely used and offers computational and theoretical advantages. However, the matrix in (14) is assumed to be unknown in this paper. Thus, developing an alternative estimation method is necessary. In this paper, ℓp-norm estimators are proposed to treat the parameter estimation problem under missing data and without information regarding the noise distribution. In the next section, the ℓp-norm estimation algorithm IRLS is reviewed.
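Before moving on, the consistent least squares estimator alluded to in (16) is worth a concrete illustration. It follows the classical bias-compensation idea for noisy regressors (see, e.g., [13]): the sample normal-equations matrix is biased by a known amount, and subtracting that amount restores consistency. The sketch below shows only that generic principle; the specific matrix appearing in (14)-(16), which also accounts for the missing-data mechanism, is not reconstructed here, so `Lambda` is simply a user-supplied matrix and the displayed form is an assumption.

```python
import numpy as np

def bias_compensated_ls(X, z, Lambda):
    """Textbook bias-compensated least squares for errors-in-variables data.

    X      : (N, d) regressor matrix built from the *observed* (noisy) data.
    z      : (N,)   observed output samples.
    Lambda : (d, d) assumed-known bias of the per-sample regressor covariance.
             In the paper this matrix is unknown, which is exactly why the
             consistent ell_p scheme of Section IV is developed instead.
    """
    N = X.shape[0]
    R = X.T @ X / N - Lambda        # compensated covariance estimate
    r = X.T @ z / N
    return np.linalg.solve(R, r)
```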

III. THE ℓp-NORM APPROACH

A. IRLS Algorithm

Define the residual as the difference between each observed output sample and the inner product of the corresponding vector of observed variables with θ̂, where θ̂ is an estimate of θ. The problem of ℓp-norm approximation consists of finding a vector θ̂ such that the ℓp-norm error function

(17)

is minimized for admissible values of the exponent p. We set the gradient of (17) with respect to θ̂ to zero,

(18)

in order to find an ℓp-norm solution to (17). Computing (18) results in

(19)

Define the weighting function

(20)

Equation (19) then becomes

(21)

where the diagonal weighting matrix collects the weights defined in (20). Since the weighting matrix is a function of the residual, (21) is nonlinear and usually must be solved via iterative methods such as IRLS. In fact, for ℓp-norm minimization, the idea of the IRLS algorithm is based on the weighted least squares (WLS) technique. Additionally, each iteration of the IRLS estimation algorithm solves a new weighted least squares problem by employing the weighted residuals of the previous iteration in the current one; i.e., at each iteration, the IRLS algorithm solves the ℓp-norm problem represented in (17) by

(22)

where

(23)

and the superscript denotes the iteration index.
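A minimal sketch of the IRLS iteration (22)-(23) for ℓp minimization. The weights follow the standard |residual|^(p-2) rule, and small residuals are floored at a constant δ, which anticipates the bounded-weight modification (24) discussed in the next paragraph; the stopping rule and the variable names are assumptions.

```python
import numpy as np

def irls_lp(X, z, p=1.5, delta=1e-4, max_iter=50, tol=1e-8):
    """Iteratively reweighted least squares for  min_theta  sum |z - X theta|^p.

    Each iteration solves a weighted least squares problem whose weights come
    from the previous residuals; residual magnitudes are floored at delta so
    the weights stay bounded (Huber-style safeguard) for 1 <= p < 2.
    """
    theta = np.linalg.lstsq(X, z, rcond=None)[0]         # ordinary LS start
    for _ in range(max_iter):
        e = z - X @ theta
        w = np.maximum(np.abs(e), delta) ** (p - 2.0)     # standard ell_p weights
        sw = np.sqrt(w)
        theta_new = np.linalg.lstsq(sw[:, None] * X, sw * z, rcond=None)[0]
        if np.linalg.norm(theta_new - theta) < tol * (1 + np.linalg.norm(theta)):
            return theta_new
        theta = theta_new
    return theta
```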


Without further modifications, the above algorithm cannot be proven to converge to the correct solution for 1 ≤ p < 2. To ameliorate this, Huber [6] suggested replacing (23) by

(24)

where δ is a small positive number; i.e., if the weighting function in (24) is selected for (22), the IRLS algorithm is convergent [6].

Remark 2: We recognize that, for p < 2, (23) produces a large value of the weighting function when the residual is close to zero. Thus, in such a case, the IRLS algorithm does not meet its convergence condition. With the modification in (24), the weights are bounded for all samples [11]. In this situation, the convergence of the IRLS algorithm can be guaranteed.

The IRLS algorithm is applicable to a variety of estimation problems. However, under an unknown input/output noise distribution and a missing-data environment (i.e., when the matrix in (14) is unknown), the IRLS algorithm is inconsistent. Therefore, a consistent ℓp-norm estimation method is developed in the sequel via an adequate modification of the IRLS algorithm.

B. The Choice of the Exponent p of the ℓp-Norm

As is well known, the optimal value of p in the ℓp-norm estimation algorithm depends critically on the distribution of the residuals. Searching for an adequate value of p in each iteration of the IRLS estimation algorithm is necessary since a priori information on the residual distribution is unavailable in this study. Thus, how to determine an adequate exponent p of the ℓp-norm from sample data is reviewed next. For many years, studies have demonstrated that the "best" p depends on the kurtosis of the noise distribution. Furthermore, they indicate that a longer tailed distribution has a larger kurtosis, i.e., a smaller "optimal" p [5], [8], [15]. In the paper by Money et al. [8], the authors recommended that an optimal value of the exponent p of the ℓp-norm can be expressed as a function of the kurtosis, i.e.,

(25)

However, it is stressed that the investigation of Money et al. is based on a known residual distribution. In other words, a priori information on the kurtosis is necessary before applying formula (25) to the choice of p. In many practical situations, the kurtosis of the residual distribution is unknown, and it must be estimated from sample data. Therefore, in such cases, Money et al. suggest that the true kurtosis can be replaced in formula (25) by a sample estimate of the true kurtosis to obtain an estimated value of p, which can then be used to determine the ℓp-norm estimates of the regression coefficients. Since the true residual distribution is unknown here, the kurtosis of the residual distribution must be estimated, i.e., the value of p must be estimated before the IRLS estimation algorithm is used. The following method is recommended to approach the value of the exponent p with the help of Money's formula: find the sample kurtosis of the residuals [15], where N is the sample size, and then select the exponent p in (23) as

(26)

If an adequate value of p is determined by (26), the IRLS algorithm in (22) may be expected to be an efficient parameter estimation algorithm in the ℓp-norm sense. Under an unknown noise distribution and missing data, however, the IRLS algorithm is inconsistent. Thus, a consistent ℓp-norm iterative estimation algorithm is developed in Section IV on the basis of the principle of bias compensation.
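The exact formulas (25)-(26) are not reproduced above, so the sketch below uses the rule of thumb usually attributed to Money et al. [8], p ≈ 9/κ² + 1, applied to the sample kurtosis κ of the residuals; both the formula and the clipping interval are assumptions here. The mapping reflects the qualitative rule stated in the text: heavier-tailed residuals (larger kurtosis) lead to a smaller exponent p.

```python
import numpy as np

def sample_kurtosis(e):
    """Plain (non-excess) sample kurtosis: m4 / m2**2."""
    e = np.asarray(e, dtype=float)
    d = e - e.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2

def choose_p(e, p_min=1.1, p_max=2.0):
    """Map the sample kurtosis of the residuals to an ell_p exponent.

    Uses p = 9/kappa**2 + 1, the rule of thumb often credited to Money et al.;
    the paper's own formulas (25)-(26) may differ, so both the mapping and the
    clipping interval [p_min, p_max] are assumptions.
    """
    kappa = sample_kurtosis(e)
    p = 9.0 / kappa ** 2 + 1.0
    return float(np.clip(p, p_min, p_max))
```

For Gaussian residuals (kurtosis 3) the rule gives p = 2, and for heavier-tailed residuals it moves toward 1, matching the qualitative behaviour described above.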

IV. CONSISTENT ℓp-NORM ITERATIVE ESTIMATION ALGORITHM

Note that the IRLS estimation algorithm (22) has the form of a weighted least squares (WLS) method. By comparison with the consistent least squares algorithm in (16), we know that the IRLS estimation algorithm (22) is always inconsistent here. Thus, a consistent ℓp-norm iterative estimation algorithm is developed in the following. First, by (6), we know

(27)

where the weighting function is defined as in (20). Moreover, by (4), we have

(28)

(29)

and

(30)

Similarly, by (28)-(30), we have

(31)

where the weighted compensation matrix is given by the diagonal matrix

(32)


and the corresponding weighted vector quantities are defined analogously. Thus, by (27) and (31), we get

(33)

where the last term involves a constant to be determined. In fact, this term is a bias compensation that allows the IRLS algorithm (22) to achieve a consistent estimate.

Remark 3: Note that if the occurrences of missing data are mutually dependent, (28)-(30) do not hold. Thus, (33) does not hold either. Therefore, the proposed ℓp-norm parameter estimation algorithm will not be a consistent estimator in that case. A simulation study of a correlated pattern of missing data is included in the example section (see Table XIII).

Now, (33) is rewritten as

(34)

i.e.,

(35)

Next, a consistent ℓp-norm iterative parameter estimation algorithm is developed. The theoretical values of the covariance quantities in (35) cannot be obtained directly, and they should be replaced by sample averages. Note that the weighted compensation matrix and the constant are also unknown. Thus, based on (35), a consistent iterative ℓp-norm parameter estimation scheme is proposed as

(36)

where the value of the weighting function is defined as in (24). Moreover, by (15), we have

(37)

Thus, the required quantity can be obtained by solving

(38)

and the bias-compensation constant can be obtained by solving (see Appendix A)

(39)

where the auxiliary quantities are defined by

(40)

and the subscript notation denotes the ith element of the corresponding vector. Therefore, based on (36)-(39), an ℓp-norm iterative parameter estimation algorithm is proposed for the parameter estimation of a system with noisy input/output data and missing measurements. The proposed ℓp-norm parameter estimation algorithm is presented in Appendix B, where the superscript denotes the iteration step. Now, the following main parameter estimation procedure is proposed, where N denotes the iteration step as well as the size of the data set.

V. MAIN PARAMETER ESTIMATION ALGORITHM

Step 1) Given a small positive number δ, let the initial exponent be p = 2 and let the initial estimate be the usual least squares estimate of θ. Use the IRLS algorithm in (22) to obtain a parameter estimate. Set the initial data size N.

Step 2) Calculate the residuals for the N available data.

Step 3) Update the exponent p via a two-case rule based on the kurtosis-driven value defined in (26).

Step 4) Apply the ℓp-norm algorithm to find the new parameter estimate (see Appendix B).

Step 5) Increase N (the iteration step and the size of the data set) and go to Step 2 until the parameter estimate appears to have converged.

Remark 4: For the initial parameter estimate, the residual distribution is assumed to be Gaussian, i.e., we let p = 2. Moreover, we know that the ℓ1-norm solution is not unique; to avoid this case, the exponent p in the estimation algorithm is updated as in Step 3.
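A sketch of the outer loop of Section V, showing the control flow only: a least squares initialization with p = 2, residual computation, a kurtosis-based update of p, and an ℓp refit. The bias-compensated step of Section IV / Appendix B is not reconstructable from the available text, so the plain IRLS sketch stands in for Step 4; the code therefore illustrates the procedure's structure, not the paper's consistency property. It assumes `irls_lp()` and `choose_p()` from the earlier sketches are in scope.

```python
import numpy as np
# assumes irls_lp() and choose_p() from the earlier sketches are available

def estimate_parameters(X, z, n_outer=10, delta=1e-4):
    """Outer loop of the Section V procedure (illustrative control flow only).

    Step 1: ordinary LS initialisation (Gaussian working assumption, p = 2).
    Step 2: residuals from the current estimate.
    Step 3: update the exponent p from the sample kurtosis of the residuals.
    Step 4: re-estimate theta with an ell_p (IRLS) step; the paper uses its
            bias-compensated variant here, which is not reproduced.
    Step 5: repeat until the estimate stops changing.
    """
    theta = np.linalg.lstsq(X, z, rcond=None)[0]
    p = 2.0
    for _ in range(n_outer):
        e = z - X @ theta                              # Step 2
        p = choose_p(e)                                # Step 3
        theta_new = irls_lp(X, z, p=p, delta=delta)    # Step 4 (stand-in)
        if np.allclose(theta_new, theta, rtol=1e-6, atol=1e-8):
            break
        theta = theta_new                              # Step 5
    return theta, p
```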

VI. CONVERGENCE ANALYSIS

The convergence property of the proposed parameter estimation algorithm is analyzed in this section. We first state a convergence theorem for an iterative sequence that will be employed to treat the convergence problem of the proposed ℓp-norm iterative parameter estimation algorithm.

Theorem 1 [9]: Consider the equation

(41)

where the system matrix is n-by-n and the unknown and forcing vectors are n-by-1. Let the n-dimensional vector sequence be obtained from the following sequential algorithm:

(42)

For any initial vector and any forcing vector, the vector sequence converges to the solution of (41) if and only if every eigenvalue of the iteration matrix has absolute value less than unity.
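Theorem 1 is the standard convergence result for a linear fixed-point iteration: the recursion x ← A x + b converges, from every initial vector, to the solution of x = A x + b exactly when every eigenvalue of A has magnitude below one. A small numerical check of that statement; the matrix A below is arbitrary and chosen purely for illustration.

```python
import numpy as np

A = np.array([[0.5, 0.2],
              [0.1, 0.3]])            # spectral radius < 1, so the iteration converges
b = np.array([1.0, -2.0])

print("spectral radius:", max(abs(np.linalg.eigvals(A))))

x = np.zeros(2)
for _ in range(200):
    x = A @ x + b                     # the linear iteration addressed by Theorem 1
x_star = np.linalg.solve(np.eye(2) - A, b)   # fixed point of x = A x + b
print("iterate:", x, " fixed point:", x_star)
```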

Next, the convergence analysis of the proposed parameter estimation algorithm is carried out as follows. Before further discussion regarding the convergence property of the proposed ℓp-norm iterative parameter estimation algorithm, the convergence of the following iteration estimation scheme is discussed first:

(43)

By Theorem 1 and (35), we know that the sequence obtained from (43) converges to θ if the absolute values of all of the eigenvalues of the associated iteration matrix are less than unity. Therefore, in the next step, we will prove that this is indeed the case. Let an arbitrary eigenvalue of the iteration matrix and its associated eigenvector be given. Moreover, we know that one component of this matrix is the covariance of the residual, whereas the other component is composed of the covariances of the measured signal and the noise. Therefore, if the power of the measured signal is larger than the power of the residual, we can prove the following result:

(44)

By the definitions of the quantities involved and (28)-(30), it is easy to obtain

(45)

Furthermore, if the probability of missing measurements is less than one half, then we have

(46)

Therefore, the result of (44) can be obtained, i.e., the absolute values of all of the eigenvalues of the iteration matrix are less than unity. Thus, the following theorem is obtained.

Theorem 2: Consider the system given by (1) and (4). Let the matrices be defined as in (35) and (32), respectively. If the probability of missing measurements is less than one half, the absolute values of all of the eigenvalues of the iteration matrix are less than unity, i.e., the condition of Theorem 1 is satisfied.

Proof: From the above analysis, the result can be obtained directly.

Since we assumed that the probability of missing measurements is less than one half, by Theorem 1 and Theorem 2 we know that the sequence obtained from (43) converges to θ. It is noted that (36) and (43) are of similar form. In our proposed method (36), however, both the system parameters and the compensation matrices are estimated simultaneously. In fact, we know that if the weighting function is given as in (24), the IRLS algorithm is convergent [6]. Hence, the major distinction between (36) and (43) is that the weighting-dependent matrix in (36) does not remain constant in the iterative process and must be updated continuously.

Note that, as the number of data tends to infinity, we have

(47)

Thus, as the number of data tends to infinity, we get

(48)


Moreover, based on (47) and (A.2), it is obvious that analogous limits hold for the weighted quantities, which are again diagonal. Thus, as the number of data tends to infinity, we also have

(49)

Note that the major distinction between these limiting quantities and their unweighted counterparts is that they take different weightings. Thus, as the number of data tends to infinity, we have

(50)

Therefore, we have the following theorem.

Theorem 3: Consider the system given by (1) and (4). If the probability of missing measurements is less than one half, the parameter estimate in (36) will converge to the true parameter θ as the number of data tends to infinity.

Proof: Based on (48) and (49), it is obvious that, asymptotically, (36) becomes

(51)

Furthermore, by (50) and Theorem 2, the absolute values of all the eigenvalues of the iteration matrix are less than unity. Thus, by Theorem 1, we obtain

(52)

as the number of data and the number of iterations tend to infinity. By comparing (35) with (52), we conclude that if the probability of missing measurements is less than one half, the estimate in (36) will converge to the true parameter θ as the number of data and the number of iterations tend to infinity.

TABLE I: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR TYPE (i) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 10 dB, N = 500)

TABLE II: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR TYPE (ii) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 10 dB, N = 500)

VII. SIMULATION RESULTS

The system input and output signals are corrupted by the measurement noise. The following three different types of noise distribution are considered. i) zero mean white Gaussian noise; ii) zero mean white contaminated Gaussian noise with the distribution of form iii) white noise uniformly distributed in the interval . We assume that the system input and output have the same signal-to-noise ratio, and we also assume that the measurement and have the same distribution. The following noises three data missing patterns are considered in this example. Pattern 1: Data is missed on a regular basis, where for each , successive measurements period of length successive measurements are missed. are available, and This is the pattern discussed by Jones [2]. Pattern 2: Data missed is random using the Bernoulli mod. ulating given in (3) with Pattern 3: The missing data missing is one with the missed successive measurements similar to Pattern 1 (i.e., measurements are missed). However, the occurrence of missing measurements is random and mutually independent. The . probability of missing measurements is is white Gaussian with The input driving sequence zero mean and unit variance. Monte Carlo simulations were per, each formed by generating 50 independent sequences of 500, 400, 200, and 100 samples. with length The proposed parameter estimation algorithm is used to estimate the system parameters for all the three types of noise distributions and three data missing patterns. Those results are summarized in Tables I–XII, which show the averaged solution and standard deviation (Std. Dev.). The value of is chosen as . In general, for different types of noise distributions and different patterns of missing data, the system parameters can


TABLE III: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR TYPE (i) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 20 dB, N = 500)

TABLE IV: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR TYPE (ii) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 20 dB, N = 500)

TABLE V: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR TYPE (iii) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 20 dB, N = 500)

TABLE VI: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 1 IN EXAMPLE 1 (SNR = 20 dB, N = 200)

TABLE VII: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 2 IN EXAMPLE 1 (SNR = 20 dB, N = 200)

TABLE VIII: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 1 IN EXAMPLE 1 (SNR = 20 dB, N = 100)

TABLE IX: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 2 IN EXAMPLE 1 (SNR = 20 dB, N = 100)

TABLE X: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 3 IN EXAMPLE 1 (SNR = 20 dB, N = 400)

In general, for different types of noise distributions and different patterns of missing data, the system parameters can be estimated accurately by the proposed parameter estimation algorithm. Furthermore, the simulation results presented in Tables III-XII show that the large sample size cases give more efficient (less variance and less biased) estimates than the small sample size cases. In addition, the simulation results also show that the Type (i) and Type (ii) noise cases give more efficient (less variance) estimates than the Type (iii) noise case. The simulation results of system parameter estimation via the ℓ2-norm estimation algorithm are also given in Tables XIV and XV. In fact, we know that the ℓ2-norm method is nonrobust. Hence, it is expected that the accuracy of the estimates obtained by the proposed ℓp-norm estimation algorithm is superior to that of the ℓ2-norm estimation algorithm. Moreover, the effect of a correlated pattern of missing data (i.e., for different instants, the occurrences of missing data are dependent random variables) is also studied, and the simulation results are presented in Table XIII. However, for the correlated pattern, the simulation results reveal that the proposed ℓp-norm parameter estimation algorithm is not a consistent parameter estimator. The convergence behavior of the estimated parameters â1, â2, b̂1, b̂2, b̂3, and of the exponent p̂ obtained by the proposed ℓp-norm estimation algorithm is plotted in Figs. 1-3. It is clear from the plots that the proposed ℓp-norm estimation algorithm produces accurate estimates of the system parameters. Thus, the simulation results confirm that the proposed ℓp-norm parameter estimation algorithm is an efficient approach to the parameter estimation problem with missing data and unknown input/output measurement noise distribution.



TABLE XI: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 3 IN EXAMPLE 1 (SNR = 20 dB, N = 200)

TABLE XII: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR MISSING PATTERN 3 IN EXAMPLE 1 (SNR = 20 dB, N = 100)

TABLE XIII: ESTIMATED PARAMETER (MEAN AND STD. DEV.) FOR CORRELATED PATTERN IN EXAMPLE 1 (SNR = 20 dB, N = 400)

TABLE XIV: ESTIMATED PARAMETER (MEAN AND STD. DEV.) VIA THE ℓ2-NORM METHOD FOR TYPE (i) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 20 dB, N = 500)

TABLE XV: ESTIMATED PARAMETER (MEAN AND STD. DEV.) VIA THE ℓ2-NORM METHOD FOR TYPE (ii) NOISE DISTRIBUTION IN EXAMPLE 1 (SNR = 20 dB, N = 500)

Fig. 1. â1 ("—") and â2 ("- -") parameter estimates of Example 1 (SNR = 20 dB).

Example 2: An LPC (all-pole) model of a short-term speech signal is given by the corresponding difference equation. The driving sequence is white Gaussian with zero mean and unit variance. Monte Carlo simulations were performed by generating 50 independent sequences of fixed length. The following missing-data pattern is considered in this example.

Fig. 2. b̂1 ("—"), b̂2 ("- -"), and b̂3 ("- · -") parameter estimates of Example 1 (SNR = 20 dB).

Pattern: Data are missed on a regular basis, where in each period a fixed number of successive measurements are available and a fixed number of successive measurements are missed. The proposed parameter estimation algorithm is used to estimate the system parameters. The results are summarized in Table XVI, which also includes the results of an interpolation method (i.e., the missing data are linearly interpolated before estimation).
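The interpolation baseline compared in Table XVI fills in missing samples by linear interpolation before the parameters are estimated. A minimal sketch of that preprocessing step (the estimator applied afterwards is whichever method is being compared):

```python
import numpy as np

def interpolate_missing(y, observed):
    """Linearly interpolate the samples flagged as missing.

    y        : 1-D array of recorded values (content at missing positions is ignored).
    observed : boolean mask, True where the sample was actually measured.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size)
    return np.interp(t, t[observed], y[observed])

# example: a short record with two missing samples
y = np.array([1.0, 2.0, 0.0, 4.0, 0.0, 6.0])
obs = np.array([True, True, False, True, False, True])
print(interpolate_missing(y, obs))   # -> [1. 2. 3. 4. 5. 6.]
```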


In general, the poles of an LPC model of a speech signal lie close to, but inside, the unit circle. Thus, in the iteration process, some poles of the estimated model may lie outside the unit circle, and the convergence of the algorithm to the optimal solution is not always guaranteed. Therefore, this phenomenon will affect the accuracy of the estimation of the system parameters, and how to improve the accuracy of this kind of parameter estimation is worthy of further study. Moreover, the simulation results of system parameter estimation via a robust LPC parameter estimation method (RLPC) [19] are also given in Table XVII. Based on the simulation results in Tables XVI and XVII, we can conclude that the proposed ℓp-norm estimation algorithm is superior to the robust LPC algorithm.

Fig. 3. p̂ parameter estimate of Example 1 (SNR = 20 dB).

TABLE XVI: ESTIMATED PARAMETER (MEAN AND STD. DEV.) VIA THE ℓp-NORM METHOD IN EXAMPLE 2

TABLE XVII: ESTIMATED PARAMETERS (MEAN AND STD. DEV.) VIA THE RLPC METHOD IN EXAMPLE 2

VIII. CONCLUSION

The problem of system parameter estimation with missing data was studied in this paper. The system input and output signals are assumed to be corrupted by measurement noise, and a priori information on the noise distribution is unknown. A consistent ℓp-norm parameter estimation algorithm, which is a modified version of the IRLS algorithm, was proposed to estimate the system parameters. Our results indicate that the proposed ℓp-norm estimation algorithm converges if the probability of occurrence of missing measurements is less than one half and the amount of data tends to infinity. For three different types of noise distributions and three different patterns of missing data, the simulation results illustrate that the unknown parameters can be estimated accurately. Moreover, the simulation results show that the large sample size cases give more efficient estimates than the small sample size cases. In addition, the simulation results also show that the Gaussian and uniform noise distribution cases give more efficient estimates than the ε-contaminated Gaussian noise distribution case. Thus, under input/output missing data and a noisy environment, although a priori information on the noise distribution is unknown and the a priori probability of occurrence of missing data is also unknown, the proposed consistent parameter estimation algorithm is still an efficient approach to the parameter estimation problem.

APPENDIX A

In this Appendix, we derive (39). Recalling (1), we have

(53)

Then, (53) implies that the associated matrix is singular [14]. Thus, by (28)-(30), the corresponding matrix formed from the observed quantities is also singular. If we define an appropriate vector based on this property, then

(54)

By (54), the bias-compensation constant can be expressed in closed form; thus, its value can be estimated by the sample version of that expression, where the quantities involved are defined in (40).

APPENDIX B
ℓp-NORM ITERATIVE ESTIMATION ALGORITHM

Step 1) With the small positive number δ and the value of p, set the initial quantities.

Step 2) Find the weighting matrix via its defining equation (note that this matrix is diagonal).

Step 3) Calculate the sample estimates of the required covariance quantities.

Step 4) Find the bias-compensation constant via (39).

Step 5) Calculate the compensated estimates.

Step 6) Calculate the parameter estimate from the compensated quantities.

ACKNOWLEDGMENT

The authors would like to thank the reviewers and the associate editor for their helpful suggestions and improvements to the presentation of this paper.

REFERENCES

[1] Y. Rosen and B. Porat, "Optimal ARMA parameter estimation based on the sample covariances for data with missing observations," IEEE Trans. Inform. Theory, vol. 35, pp. 342–349, 1989.
[2] R. H. Jones, "Maximum likelihood fitting of ARMA models to time series with missing observations," Technometrics, vol. 22, pp. 389–395, 1980.
[3] W. A. Fuller, "Properties of some estimators for the errors-in-variables model," Ann. Statist., vol. 8, pp. 407–422, 1980.
[4] J. E. Gentle, "Least absolute values estimation: An introduction," Commun. Stat.-Sim. Comput., vol. 6, pp. 313–328, 1977.
[5] R. Gonin and A. H. Money, "Nonlinear lp-norm estimation—Part II: The asymptotic distribution of the exponent, p, as a function of the sample kurtosis," Commun. Stat.-Theory Meth., vol. 14, pp. 841–849, 1985.
[6] P. J. Huber, "Robust statistics: A review," Ann. Math. Stat., vol. 43, no. 4, pp. 1041–1067, 1972.
[7] K. V. Fernando and H. Nicholson, "Identification of linear systems with input and output noise: The Koopmans–Levin method," Proc. Inst. Elect. Eng. D, vol. 132, pp. 30–36, 1985.
[8] A. H. Money, J. F. Affleck-Graves, M. L. Hart, and G. D. I. Barr, "The linear regression model: lp norm estimation and the choice of p," Commun. Stat.-Sim. Comput., vol. 11, pp. 89–109, 1982.
[9] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables. New York: Academic, 1970.
[10] T. Soderstrom, "Identification of stochastic linear systems in presence of input noise," Automatica, vol. 17, pp. 713–725, 1980.
[11] R. Yarlagadda, J. B. Bednar, and T. L. Watt, "Fast algorithms for lp deconvolution," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 174–181, 1985.
[12] G. G. Roussas, A First Course in Mathematical Statistics. Reading, MA: Addison-Wesley, 1973.
[13] H. Schneeweiss, "Consistent estimation of a regression with errors in the variables," Metrika, vol. 23, pp. 101–115, 1976.
[14] T. Soderstrom and P. Stoica, System Identification. London, U.K.: Prentice-Hall, 1989.
[15] V. A. Sposito and M. L. Hand, "The efficiency of using the sample kurtosis in selecting optimal lp estimators," Commun. Stat.-Sim. Comput., vol. 12, pp. 2511–2524, 1983.
[16] T. T. Pham and R. J. P. deFigueiredo, "Maximum likelihood estimation of a class of non-Gaussian densities with application to lp deconvolution," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 73–82, 1989.
[17] M. Tanaka and T. Katayama, "Robust identification and smoothing for linear systems with outliers and missing data," in Proc. 11th IFAC World Congr., Tallinn, Estonia, 1990, pp. 117–122.
[18] B. S. Chen, J. M. Chen, and S. C. Shern, "An ARMA robust system identification using a generalized lp norm estimation algorithm," IEEE Trans. Signal Processing, vol. 42, pp. 1063–1073, 1994.
[19] M. Milosavljevic, M. Markovic, B. Kovacevic, and M. Veinovic, "Robust LPC parameter estimation in standard CELP 4800 b/s speech coder," in Proc. IEEE TENCON-Digital Signal Process. Appl., 1996, pp. 200–203.
[20] J. M. Chen, B. S. Chen, and W. S. Chang, "Parameter estimation of linear systems with input-output noisy data: A generalized lp norm approach," Signal Process., vol. 37, pp. 345–356, 1994.

Jeng-Ming Chen was born in Taiwan, R.O.C. He received the B.S. degree in electrical engineering from Tamkang University, Tamsui, Taiwan, and the M.S. degree in materials science and engineering and the Ph.D. degree in electrical engineering from the National Tsing-Hua University, Hsinchu, Taiwan. He is now an Associate Professor of electrical engineering at the St. John's and St. Mary's Institute of Technology, Tamsui. His current research interests include parameter estimation, signal processing, and estimation theory applications.

Bor-Sen Chen (M'82–SM'89) received the B.S. degree from the Tatung Institute of Technology, Tatung, Taiwan, R.O.C., in 1970, the M.S. degree from the National Central University, Taipei, Taiwan, in 1973, and the Ph.D. degree from the University of Southern California, Los Angeles, in 1982. He was a Lecturer, Associate Professor, and Professor at the Tatung Institute of Technology from 1973 to 1987. He is now a Professor at the National Tsing-Hua University, Hsinchu, Taiwan. His current research interests include control and signal processing. Dr. Chen has received the Distinguished Research Award from the National Science Council of Taiwan four times. He is a Research Fellow of the National Science Council and the Chair of the Outstanding Scholarship Foundation.