IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 50, NO. 8, AUGUST 2005



A Fast Nonlinear Model Identification Method

Kang Li, Jian-Xun Peng, and George W. Irwin

Abstract—The identification of nonlinear dynamic systems using linear-in-the-parameters models is studied. A fast recursive algorithm (FRA) is proposed both to select the model structure and to estimate the model parameters. Unlike the orthogonal least squares (OLS) method, the FRA solves the least-squares problem recursively over the model order without requiring matrix decomposition. The computational complexity of both algorithms is analyzed, along with their numerical stability. The new method is shown to require much less computational effort and to be numerically more stable than OLS.

Index Terms—Computational complexity, fast recursive algorithm, nonlinear system identification, numerical stability.

I. INTRODUCTION


Some widely used nonlinear regression models and neural networks constitute linear-in-the-parameters models, for example the nonlinear autoregressive model with exogenous inputs (NARX) and radial basis function (RBF) networks [1]–[8]. Such models form a linear combination of model terms, or basis functions, which are nonlinear functions of the system variables. Depending on the nonlinear functions employed, linear-in-the-parameters models possess broad approximation capabilities and have found wide application [1]–[8]. One problem with such models is that an excessive number of candidate model terms or basis functions usually has to be considered initially [2]–[8]. From these, a useful model is then generated based on the parsimonious principle [9], [10] of selecting the smallest possible model that explains the data. Given a model selection criterion, this can be achieved by an exhaustive search over all possible combinations of candidates using a least-squares method, but this is computationally very expensive. To reduce the computational complexity, efficient suboptimal search algorithms have been proposed, among which the orthogonal least-squares (OLS) method is perhaps the most popular [2]–[9]. OLS was first applied to nonlinear dynamic system identification [2], [3] and is now widely used in many other areas [4]–[8]. In general, OLS approaches are derived from an orthogonal (or QR) decomposition of the regression matrix [2]–[9]. The elegance of the OLS approach lies in the fact that the net decrease in the cost function can be

Manuscript received September 18, 2003; revised January 10, 2005. Recommended by Associate Editor E. Bai. This work was supported by the U.K. EPSRC under Grant GR/S85191/01 to K. Li. The authors are with the School of Electrical and Electronic Engineering, Queen's University of Belfast, Belfast BT9 5AH, U.K. (e-mail: [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/TAC.2005.852557

0018-9286/$20.00 © 2005 IEEE


explicitly formulated as each new term is selected for inclusion in the model, with the model parameters then obtained using backward substitution [2]. In this way, the computational burden is significantly reduced. A number of fast orthogonal least-squares algorithms have been proposed to further improve the efficiency [6]–[8]. Most previously proposed fast OLS algorithms are based on, or are equivalent to, conventional OLS and can be computationally more efficient in certain cases. Nevertheless, a survey of the literature shows that, although some estimates have appeared, a complete and accurate analysis of the computational complexity is lacking when OLS methods are used for the identification of nonlinear systems. Moreover, although numerical stability has been discussed [2], little has been done so far to examine it systematically.

In this note, a fast recursive algorithm (FRA) is proposed for nonlinear dynamic system identification using linear-in-the-parameters models. Unlike OLS [2]–[9], this method solves the least-squares problem recursively over the model order without requiring matrix decomposition or transformation. A complete analysis of the computational complexity of both the FRA and OLS is presented, and the numerical stability of both algorithms is also examined.

II. PROBLEM FORMULATION AND OLS

Consider a nonlinear discrete-time dynamic system [2], [7]

y(t) = f(y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)) = f(x(t))    (1)

where u(t) and y(t) are the system input and output variables at time instant t; n_u and n_y are the corresponding maximal lags; x(t) = [y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)]^T is the model "input" vector; and f(\cdot) is some unknown nonlinear function. Suppose a linear-in-the-parameters model is used to represent (1) such that

y(t) = \sum_{i=1}^{n} \theta_i \varphi_i(x(t)) + \varepsilon(t)    (2)

where \varphi_i(\cdot), i = 1, \ldots, n, are the candidate model terms and \varepsilon(t) is the model residual sequence.

Remark 1: The total number n of candidate model terms can initially be significantly large, and model (2) is then an over-fitted one for system (1). Therefore, it is important to find a parsimonious model with a much smaller number, say k (k \ll n), of terms for nonlinear system identification [2], [10].

Suppose N data samples \{x(t), y(t)\}_{t=1}^{N} are used for model identification; (2) can then be formulated as

y = \Phi \Theta + \Xi    (3)

where \Phi = [\varphi_1, \ldots, \varphi_n] with \varphi_i = [\varphi_i(x(1)), \ldots, \varphi_i(x(N))]^T, i = 1, \ldots, n, \Phi \in R^{N \times n}; \Theta = [\theta_1, \ldots, \theta_n]^T; y = [y(1), \ldots, y(N)]^T; and \Xi = [\varepsilon(1), \ldots, \varepsilon(N)]^T.

A weighted least-squares cost function (4) is used to estimate the model parameters. Without loss of generality, the weights are all assumed to be unity in this note. Now, (4) can be reformulated as

E = (\Phi \Theta - y)^T (\Phi \Theta - y).    (5)

If \Phi is of full column rank, the least-squares estimate of \Theta that minimizes this cost function is given by [9]

\hat{\Theta} = \arg \min_{\Theta} \| y - \Phi \Theta \|_2^2 = (\Phi^T \Phi)^{-1} \Phi^T y    (6)

where \| \cdot \|_2 denotes the Euclidean norm and \Phi^T \Phi is sometimes called the information matrix. The associated minimal cost function is

E(\hat{\Theta}) = y^T y - \hat{\Theta}^T \Phi^T y.    (7)

Among the numerical methods available for computing \hat{\Theta} and E(\hat{\Theta}), matrix decomposition methods have been widely used [8]. In particular, QR decomposition of \Phi led to the well-known OLS method [2] for the modeling and identification of nonlinear dynamic systems. In conventional OLS [2]–[4], an orthogonal transformation is applied to (2) to produce

y(t) = \sum_{i=1}^{n} g_i w_i(x(t)) + \varepsilon(t)    (8)

where the w_i(\cdot) are the orthogonalized model terms and the g_i are the corresponding parameters. The estimated parameters in (8) are then computed as
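As a concrete illustration of the cost function (5), the least-squares estimate (6), and the parsimonious term selection discussed in Section I, the following sketch fits a linear-in-the-parameters model by greedy forward selection on synthetic data. This is only a plain least-squares illustration, not the authors' FRA or the OLS orthogonalization; the synthetic system, the candidate term set, and the `solve`/`ls_cost` helpers are illustrative assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def ls_cost(cols, y):
    """Parameters via the normal equations (6) and the cost E of (5)."""
    k, N = len(cols), len(y)
    G = [[sum(cols[i][t] * cols[j][t] for t in range(N)) for j in range(k)]
         for i in range(k)]                                        # Phi^T Phi
    c = [sum(cols[i][t] * y[t] for t in range(N)) for i in range(k)]  # Phi^T y
    th = solve(G, c)
    E = sum((sum(th[i] * cols[i][t] for i in range(k)) - y[t]) ** 2
            for t in range(N))
    return th, E

# Synthetic system (an assumption, noise-free for clarity):
# y(t) = 0.5*u(t-1) - 0.3*u(t-1)^2.
u = [((7 * t) % 11 - 5) / 5.0 for t in range(60)]
y = [0.5 * u[t - 1] - 0.3 * u[t - 1] ** 2 for t in range(1, 60)]

# Candidate model terms phi_i(x(t)): polynomial terms in u(t-1).
cands = {f"u^{p}": [u[t - 1] ** p for t in range(1, 60)] for p in range(1, 5)}

# Greedy forward selection: at each step, add the candidate giving the
# largest net decrease in the cost E.
selected = []
for _ in range(2):
    best = min((name for name in cands if name not in selected),
               key=lambda name: ls_cost([cands[m] for m in selected + [name]], y)[1])
    selected.append(best)

theta, E = ls_cost([cands[name] for name in selected], y)
print(selected, [round(v, 3) for v in theta], round(E, 6))
```

Each step above re-solves the full least-squares problem to score a candidate, which is exactly the expense that motivates this note: OLS obtains the net cost decrease cheaply through orthogonalization, while the FRA computes it recursively over the model order without matrix decomposition.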