Appl Intell (2010) 32: 173–192 DOI 10.1007/s10489-010-0211-x
Suboptimal nonlinear predictive control based on multivariable neural Hammerstein models
Maciej Ławryńczuk
Published online: 21 February 2010 © Springer Science+Business Media, LLC 2010
Abstract This paper describes a computationally efficient nonlinear Model Predictive Control (MPC) algorithm in which a neural Hammerstein model is used. The Multiple-Input Multiple-Output (MIMO) dynamic model contains a neural steady-state nonlinear part in series with a linear dynamic part. The model is linearized on-line; as a result, the MPC algorithm requires solving only a quadratic programming problem and the necessity of nonlinear optimization is avoided. A neutralization process is considered to discuss properties of neural Hammerstein models and to show advantages of the described MPC algorithm. In practice, the algorithm gives control performance similar to that obtained in nonlinear MPC, which hinges on non-convex optimization.

Keywords Process control · Model predictive control · Hammerstein systems · Neural networks
This work was supported by Polish national budget funds for science for years 2009–2011 in the framework of a research project.

M. Ławryńczuk, Institute of Control and Computation Engineering, Faculty of Electronics and Information Technology, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland. e-mail: [email protected]

1 Introduction

Model Predictive Control (MPC) algorithms based on linear models have been successfully used for years in advanced industrial applications [34, 41, 43, 46]. This is mainly because they can take into account constraints imposed on both process inputs (manipulated variables) and outputs (controlled variables), which usually affect quality, economic efficiency and safety. Moreover, MPC techniques are very efficient in the case of multivariable processes, which have many manipulated and many controlled variables (Multiple-Input Multiple-Output (MIMO) processes). Since properties of many processes are nonlinear, different nonlinear MPC approaches have been developed [19, 46]. Because neural networks [18, 42] can be efficiently used for modeling numerous technological processes [23], they can be used in various MPC algorithms [4, 23, 32, 38, 40, 46, 47].

The class of block-oriented nonlinear models includes model structures which are composed of nonlinear steady-state (static) parts and linear dynamic parts connected in series [25]. Hammerstein and Wiener models are the best known and most frequently used types of block-oriented nonlinear models. For identification of Hammerstein models different methods have been proposed: correlation methods [6], linear optimization methods [8, 36], parametric methods [12], non-parametric regression methods [15] and nonlinear optimization methods [25, 48]. Hammerstein structures can be efficiently used for modeling various technological processes. For example, modeling of the loading process in diesel engines is considered in [3], distillation column modeling is discussed in [12, 13, 28], modeling of a heat exchanger is described in [12], and modeling of neutralization processes (pH chemical reactors) in [13, 27, 39, 45]. An excellent review of identification algorithms and applications of Hammerstein models in process modeling, control and fault diagnosis can be found in [25].

In the simplest case polynomials are used in the steady-state nonlinear part of Hammerstein models [12, 25, 36]. Identification of polynomial Hammerstein models is not difficult: their parameters can be calculated very efficiently by means of the least squares method (off-line) or by means of the recursive version of this method (on-line).
Furthermore, structure selection can be accomplished easily by the orthogonal least squares algorithm. Unfortunately, polynomial
Hammerstein models also have some important disadvantages. First of all, although any continuous function defined on an interval can be approximated arbitrarily closely by a polynomial (the Weierstrass approximation theorem), some nonlinear functions need polynomials of a high order. This is particularly important in the case of multivariable processes: because polynomials have many parameters, the obtained models may be very complex. Hence, the model uncertainty is likely to be increased and the identification problem can be ill-conditioned. Moreover, polynomial Hammerstein models may have oscillatory interpolation and extrapolation properties. All things considered, the application of polynomial Hammerstein models is limited in practice; they are recommended only in some specific cases. Bearing in mind all the aforementioned disadvantages of polynomial Hammerstein models, it is not surprising that different structures have been used as the steady-state part of the model, for example polynomial splines [10], fuzzy systems [1], Support Vector Machines (SVM) [14] and wavelets [45]. An interesting alternative is to use neural networks [2, 25]. Such an approach has a few advantages. Neural networks are universal approximators [21], which means that a multilayer perceptron with at least one hidden layer can approximate any smooth function to an arbitrary degree of accuracy. Unlike polynomial approximations, neural approximations are very smooth and do not suffer from oscillatory interpolation and extrapolation behavior. Moreover, neural networks have a simple structure and a relatively small number of parameters in comparison with other structures (e.g. polynomials), which is particularly important for MIMO processes. The computational complexity of identification algorithms for neural Hammerstein models is relatively low [2, 25].
This paper details a computationally efficient nonlinear MPC algorithm with neural Hammerstein models; a preliminary description of this algorithm is given in [31]. Two types of multivariable models are discussed. Although Hammerstein models are frequently used for control [1, 7, 13, 17, 24, 26, 37, 39], an inverse model is usually necessary. In the described algorithm the neural Hammerstein model is used on-line to find its local linearization and a nonlinear free trajectory. Thanks to linearization, the future control policy can be calculated on-line from an easy-to-solve quadratic programming problem. It is shown that in practice the algorithm gives closed-loop performance comparable to that obtained in fully-fledged nonlinear MPC, in which nonlinear optimization is repeated at each sampling instant. To show the advantages of neural Hammerstein models and the efficiency of the discussed MPC algorithm, a multivariable neutralization process (pH reactor) with two manipulated and two controlled variables is considered. Control of pH is of crucial importance in many chemical and biochemical processes. The process exhibits severe nonlinear behavior
and therefore cannot be adequately controlled by means of the classical PI controller or by simple MPC algorithms based on constant linear models [35]. Hence, the neutralization reactor is a standard benchmark used for comparing different model structures and nonlinear control strategies, e.g. [9, 11, 16, 20, 22, 27, 29, 33, 35, 39] and references therein. A simplified version of the process (with one input and one output) is usually considered; the multivariable process is researched less frequently [27, 29, 33, 35, 39]. Different approaches to pH control are used. In the simplest case a PID controller whose parameters are updated on-line can be used [9]. Alternatively, an augmented Internal Model Control (IMC) strategy for nonlinear systems can be adopted [22]. Adaptive nonlinear control systems applied to a pH reactor are described in [16, 20]. Most frequently, however, different MPC algorithms have been applied to the neutralization process. A fuzzy Dynamic Matrix Control (DMC) algorithm is discussed in [35]. An important advantage of the classical linear DMC approach is the fact that it uses an easy-to-obtain step response as the model. The fuzzy DMC algorithm uses a few local step responses, which are switched using a fuzzy approach. Yet another alternative is to use a multiple-model adaptive DMC control strategy [11]. The MPC algorithm described in [13] compensates for the nonlinearity of the process, but it needs the inverse steady-state model. A model reference adaptive control scheme designed for the neutralization process is described in [29]. In [30] a scheduling quasi-min-max MPC with an infinite horizon (to guarantee stability) is used. In [39] an approach to MPC which uses partial-least-squares (PLS) based Hammerstein models is presented. Input constraints are transformed into a nonlinear region in terms of the so-called latent variables. Unfortunately, that MPC algorithm needs nonlinear optimization.
Economic set-point optimization of the pH reactor which cooperates with MPC is described in [33].
2 Model predictive control algorithms

In MPC algorithms [34, 43, 46] at each sampling instant $k$, $k = 1, 2, \ldots$, future control increments

$$\Delta u(k) = \begin{bmatrix} \Delta u(k|k) \\ \vdots \\ \Delta u(k+N_u-1|k) \end{bmatrix} \quad (1)$$

are calculated. It is assumed that $\Delta u(k+p|k) = 0$ for $p \ge N_u$, where $N_u$ is the control horizon. The objective of MPC is to minimize differences between the reference trajectory $y^{ref}(k+p|k)$ and predicted output values $\hat{y}(k+p|k)$ over the prediction horizon $N \ge N_u$. Hence, the following quadratic cost function is usually minimized on-line

$$J(k) = \sum_{p=1}^{N} \| y^{ref}(k+p|k) - \hat{y}(k+p|k) \|^2_{M_p} + \sum_{p=0}^{N_u-1} \| \Delta u(k+p|k) \|^2_{\Lambda_p}. \quad (2)$$

As a result, future control increments (1) are calculated. In this paper multiple-input multiple-output processes are considered with $n_u$ inputs (manipulated variables) and $n_y$ outputs (controlled variables), i.e. $u(k) \in \mathbb{R}^{n_u}$, $y(k) \in \mathbb{R}^{n_y}$. $M_p \ge 0$ and $\Lambda_p > 0$ are weighting matrices of dimensionality $n_y \times n_y$ and $n_u \times n_u$, respectively. The second part of the cost function (2) penalizes excessive control increments. Only the first $n_u$ elements of the determined sequence (1) are applied to the process (i.e. the control moves for the current sampling instant $k$), $u(k) = \Delta u(k|k) + u(k-1)$. At the next sampling instant, $k+1$, output measurements are updated, the prediction is shifted one step forward and the whole procedure is repeated.

Predicted values of output variables over the prediction horizon are calculated using a dynamic model. That is why the role of the model in MPC is crucial: MPC techniques are inherently model-based, and the accuracy of the model significantly affects the quality of control. Future control increments are found on-line from the following optimization problem (for simplicity of presentation hard output constraints [34, 46] are used)

$$\min_{\Delta u(k)} \{ J(k) \}$$
subject to
$$u^{min} \le u(k+p|k) \le u^{max}, \quad p = 0, \ldots, N_u-1,$$
$$-\Delta u^{max} \le \Delta u(k+p|k) \le \Delta u^{max}, \quad p = 0, \ldots, N_u-1, \quad (3)$$
$$y^{min} \le \hat{y}(k+p|k) \le y^{max}, \quad p = 1, \ldots, N$$

where $u^{min}, u^{max}, \Delta u^{max} \in \mathbb{R}^{n_u}$, $y^{min}, y^{max} \in \mathbb{R}^{n_y}$ define constraints imposed on the magnitude of input variables, increments of input variables and the magnitude of output variables, respectively.

Fig. 1 The structure of the MIMO neural Hammerstein model of type I

3 Multiple-input multiple-output neural Hammerstein models

In general, the multiple-input multiple-output (multivariable) Hammerstein model of dynamic processes consists of a steady-state nonlinear part

$$x(k) = g(u(k)) \quad (4)$$

where $g: \mathbb{R}^{n_u} \longrightarrow \mathbb{R}^{n_x}$, in series with a linear dynamic part

$$A(q^{-1}) y(k) = B(q^{-1}) x(k). \quad (5)$$

Auxiliary signals (i.e. outputs of the steady-state part) are denoted by $x(k) \in \mathbb{R}^{n_x}$. Two types of Hammerstein models are discussed. In the first case, shown in Fig. 1, the nonlinear steady-state part of the model is realized by one neural network, whereas in the second case, depicted in Fig. 2, as many as $n_x$ neural networks are used. Structures of the neural networks used in both types of Hammerstein models are shown in Figs. 3 and 4, respectively. In both types of Hammerstein models, the nonlinear steady-state part is realized by MultiLayer Perceptron (MLP) feedforward neural networks [18] with $n_u$ inputs, one hidden layer and linear outputs. In type I of the model, outputs of the neural network are

$$x_r(k) = w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi(z_i(k)) \quad (6)$$

where $r = 1, \ldots, n_x$. Sums of inputs of the $i$th hidden node are

$$z_i(k) = w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k) \quad (7)$$

where $\varphi: \mathbb{R} \longrightarrow \mathbb{R}$ is the nonlinear transfer function (e.g. hyperbolic tangent) used in the hidden layer of the network and $K$ is the number of hidden nodes. Weights of the network are denoted by $w^1_{i,j}$, $i = 1, \ldots, K$, $j = 0, \ldots, n_u$, and $w^2_{r,i}$, $r = 1, \ldots, n_x$, $i = 0, \ldots, K$, for the first and the second layer, respectively.
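The hidden-node sums (7) and the network outputs (6) are a standard one-hidden-layer MLP evaluation; a minimal sketch (the function name and the convention that column 0 of each weight matrix holds the biases are my own, not taken from the paper):

```python
import numpy as np

def steady_state_type1(u, W1, W2):
    """Type I nonlinear steady-state part, x(k) = g(u(k)), eqs. (6)-(7).

    u:  input vector of length nu
    W1: hidden-layer weights, shape (K, nu + 1); column 0 holds biases w^1_{i,0}
    W2: output-layer weights, shape (nx, K + 1); column 0 holds biases w^2_{r,0}
    """
    z = W1[:, 0] + W1[:, 1:] @ u        # hidden-node sums, eq. (7)
    phi = np.tanh(z)                    # hyperbolic tangent transfer function
    return W2[:, 0] + W2[:, 1:] @ phi   # linear output layer, eq. (6)
```

With zero inputs and zero hidden biases the output reduces to the second-layer biases, which gives a quick sanity check of the layout.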
From (6) and (7), outputs of the steady-state part of the model are

$$x_r(k) = w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k) \Big). \quad (8)$$

In type II of the Hammerstein model, as many as $n_x$ neural networks comprise the nonlinear steady-state part.

Fig. 2 The structure of the MIMO neural Hammerstein model of type II

Their outputs are

$$x_r(k) = w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi(z^r_i(k)) \quad (9)$$

where sums of inputs of the $i$th hidden node are

$$z^r_i(k) = w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k) \quad (10)$$

and $K^r$ is the number of hidden nodes of the $r$th network ($r = 1, \ldots, n_x$). Weights of the networks are denoted by $w^{1,r}_{i,j}$, $i = 1, \ldots, K^r$, $j = 0, \ldots, n_u$, $r = 1, \ldots, n_x$, and $w^{2,r}_i$, $i = 0, \ldots, K^r$, $r = 1, \ldots, n_x$, for the first and the second layer, respectively. From (9) and (10), outputs of the steady-state part of the model are

$$x_r(k) = w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi\Big( w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k) \Big). \quad (11)$$

Fig. 3 The structure of the neural network used in type I of the Hammerstein model

Fig. 4 The structure of neural networks used in type II of the Hammerstein model ($r = 1, \ldots, n_x$)

In both types of the Hammerstein model the same dynamic linear part (5) is used. It is defined by the following polynomial matrices

$$A(q^{-1}) = \begin{bmatrix} A_{1,1}(q^{-1}) & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & A_{n_y,n_y}(q^{-1}) \end{bmatrix} \quad (12)$$

and

$$B(q^{-1}) = \begin{bmatrix} B_{1,1}(q^{-1}) & \ldots & B_{1,n_x}(q^{-1}) \\ \vdots & \ddots & \vdots \\ B_{n_y,1}(q^{-1}) & \ldots & B_{n_y,n_x}(q^{-1}) \end{bmatrix}. \quad (13)$$

Entries of the matrices $A(q^{-1})$, $B(q^{-1})$ are polynomials in the backward shift operator $q^{-1}$

$$A_{i,i}(q^{-1}) = 1 + a^i_1 q^{-1} + \cdots + a^i_{n_A} q^{-n_A} \quad (14)$$

for $i = 1, \ldots, n_y$ and

$$B_{i,j}(q^{-1}) = b^{i,j}_1 q^{-1} + \cdots + b^{i,j}_{n_B} q^{-n_B} \quad (15)$$

for $i = 1, \ldots, n_y$, $j = 1, \ldots, n_x$.

Remark 1 In this paper two notation methods are used: vectors and scalars. For example, the signals $u(k)$, $x(k)$ in (4) are vectors. When necessary or convenient, scalars are used, for example the signals $x_r(k)$ in (6), $r = 1, \ldots, n_x$.
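Since $A(q^{-1})$ is diagonal, each output of the linear dynamic part (5) with the polynomials (12)–(15) is a scalar difference equation driven by all auxiliary signals. A minimal sketch for one output $m$ (function name and array layout are my own conventions):

```python
import numpy as np

def dynamic_part_output(x_past, y_past, B, A):
    """Output m of the linear dynamic part (5), polynomials (12)-(15):
    y_m(k) = sum_r sum_l b_l^{m,r} x_r(k-l) - sum_l a_l^m y_m(k-l).

    x_past: (nB, nx) past auxiliary signals x(k-1), ..., x(k-nB)
    y_past: (nA,)    past outputs y_m(k-1), ..., y_m(k-nA)
    B:      (nB, nx) coefficients b_l^{m,r}
    A:      (nA,)    coefficients a_l^m
    """
    return float(np.sum(B * x_past) - A @ y_past)
```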
From (5) and (12)–(15), the $m$th output of the Hammerstein model ($m = 1, \ldots, n_y$) is

$$y_m(k) = \sum_{r=1}^{n_x} \sum_{l=1}^{n_B} b^{m,r}_l x_r(k-l) - \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (16)$$

In order to use the dynamic Hammerstein model in the MPC algorithm, it is necessary to express outputs of the model at the current sampling instant $k$ as functions of inputs and outputs at previous sampling instants

$$y_m(k) = f_m(u_1(k-1), \ldots, u_1(k-n_B), \ldots, u_{n_u}(k-1), \ldots, u_{n_u}(k-n_B), y_m(k-1), \ldots, y_m(k-n_A)) \quad (17)$$

where the functions $f_m: \mathbb{R}^{n_A + n_u n_B} \longrightarrow \mathbb{R}$, $m = 1, \ldots, n_y$, describe behavior of the process. In type I of the model shown in Fig. 1, using (8) and (16), one obtains

$$y_m(k) = \sum_{r=1}^{n_x} \sum_{l=1}^{n_B} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-l) \Big) \Big] - \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (18)$$

In type II of the model shown in Fig. 2, using (11) and (16), one obtains

$$y_m(k) = \sum_{r=1}^{n_x} \sum_{l=1}^{n_B} b^{m,r}_l \Big[ w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi\Big( w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k-l) \Big) \Big] - \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (19)$$

4 Suboptimal MPC with neural Hammerstein models

4.1 Suboptimal MPC quadratic programming problem

Predicted values of output variables, $\hat{y}(k+p|k) \in \mathbb{R}^{n_y}$, over the prediction horizon (i.e. $p = 1, \ldots, N$) are calculated using a dynamic Hammerstein model of the process. The general prediction equation is

$$\hat{y}(k+p|k) = y(k+p|k) + d(k) \quad (20)$$

where the quantities $y(k+p|k) \in \mathbb{R}^{n_y}$ are calculated from the model. Unmeasured disturbances $d(k) \in \mathbb{R}^{n_y}$ are assumed to be constant over the prediction horizon. They are estimated from

$$d(k) = y(k) - y(k|k-1) \quad (21)$$

where $y(k)$ are measured while $y(k|k-1)$ are calculated from the model.

If a nonlinear neural Hammerstein model is used for prediction without any simplifications, predictions are nonlinear functions of the optimized future control sequence (1). As a result, the MPC optimization problem (3) is a nonlinear task which has to be solved on-line at each sampling instant. Although in theory such an approach seems to be potentially very precise, it has limited applicability because the computational burden of nonlinear MPC is enormous and the optimization may terminate in local minima. That is why the MPC scheme with Nonlinear Prediction and Linearization (MPC-NPL) [32, 46] is adopted here. The structure of the algorithm is depicted in Fig. 5. At each sampling instant $k$ the neural Hammerstein model is used on-line to find its local linearization and a nonlinear free trajectory. Thanks to linearization, the MPC optimization task becomes a quadratic programming problem and the necessity of on-line nonlinear optimization is eliminated.

Let the linear approximation of the nonlinear neural Hammerstein model, obtained at the sampling instant $k$, be expressed as

$$A(q^{-1}) y(k) = \tilde{B}(k, q^{-1}) u(k) \quad (22)$$

where

$$\tilde{B}(k, q^{-1}) = \begin{bmatrix} \tilde{B}_{1,1}(k, q^{-1}) & \ldots & \tilde{B}_{1,n_u}(k, q^{-1}) \\ \vdots & \ddots & \vdots \\ \tilde{B}_{n_y,1}(k, q^{-1}) & \ldots & \tilde{B}_{n_y,n_u}(k, q^{-1}) \end{bmatrix} \quad (23)$$

and

$$\tilde{B}_{i,j}(k, q^{-1}) = \tilde{b}^{i,j}_1(k) q^{-1} + \cdots + \tilde{b}^{i,j}_{n_B}(k) q^{-n_B} \quad (24)$$

for $i = 1, \ldots, n_y$, $j = 1, \ldots, n_u$. The $m$th output of the linearized Hammerstein model ($m = 1, \ldots, n_y$) is

$$y_m(k) = \sum_{n=1}^{n_u} \sum_{l=1}^{n_B} \tilde{b}^{m,n}_l(k) u_n(k-l) - \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (25)$$

The matrix $\tilde{B}(k, q^{-1})$ must have $n_u$ columns because the linearized model has $n_u$ inputs ($u_1, \ldots, u_{n_u}$), whereas the matrix $B(q^{-1})$ has $n_x$ columns since the auxiliary signals $x_1, \ldots, x_{n_x}$ are inputs of the linear part of the model. The coefficients $\tilde{b}^{m,n}_l(k)$ of the linearized model (22) depend on the current operating point of the process.
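Chaining the steady-state network through the dynamic part gives the one-step model output (18); a sketch for a single output $m$ of the type I model, again assuming my column-0-holds-biases weight layout:

```python
import numpy as np

def type1_model_output(u_past, y_past, B, A, W1, W2):
    """One-step output of the type I neural Hammerstein model, eq. (18).

    u_past: (nB, nu) past inputs u(k-1), ..., u(k-nB)
    y_past: (nA,)    past outputs y_m(k-1), ..., y_m(k-nA)
    B:      (nB, nx) coefficients b_l^{m,r}, A: (nA,) coefficients a_l^m
    W1, W2: steady-state network weights (column 0 of each holds the biases)
    """
    y = 0.0
    for l in range(B.shape[0]):                    # delays l = 1, ..., nB
        z = W1[:, 0] + W1[:, 1:] @ u_past[l]       # hidden sums, eq. (7)
        x = W2[:, 0] + W2[:, 1:] @ np.tanh(z)      # auxiliary signals, eq. (8)
        y += B[l] @ x
    return y - float(A @ y_past)                   # dynamic part, eq. (16)
```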
Using the linearized model (25) recurrently, from the general prediction equation (20) one obtains predictions of consecutive outputs of the process ($m = 1, \ldots, n_y$) over the prediction horizon ($p = 1, \ldots, N$)

$$\hat{y}_m(k+1|k) = \sum_{n=1}^{n_u} \big[ \tilde{b}^{m,n}_1(k) u_n(k|k) + \tilde{b}^{m,n}_2(k) u_n(k-1) + \tilde{b}^{m,n}_3(k) u_n(k-2) + \cdots + \tilde{b}^{m,n}_{n_B}(k) u_n(k-n_B+1) \big] - a^m_1 y_m(k) - a^m_2 y_m(k-1) - a^m_3 y_m(k-2) - \cdots - a^m_{n_A} y_m(k-n_A+1) + d_m(k), \quad (26)$$

$$\hat{y}_m(k+2|k) = \sum_{n=1}^{n_u} \big[ \tilde{b}^{m,n}_1(k) u_n(k+1|k) + \tilde{b}^{m,n}_2(k) u_n(k|k) + \tilde{b}^{m,n}_3(k) u_n(k-1) + \cdots + \tilde{b}^{m,n}_{n_B}(k) u_n(k-n_B+2) \big] - a^m_1 \hat{y}_m(k+1|k) - a^m_2 y_m(k) - a^m_3 y_m(k-1) - \cdots - a^m_{n_A} y_m(k-n_A+2) + d_m(k), \quad (27)$$

$$\hat{y}_m(k+3|k) = \sum_{n=1}^{n_u} \big[ \tilde{b}^{m,n}_1(k) u_n(k+2|k) + \tilde{b}^{m,n}_2(k) u_n(k+1|k) + \tilde{b}^{m,n}_3(k) u_n(k|k) + \cdots + \tilde{b}^{m,n}_{n_B}(k) u_n(k-n_B+3) \big] - a^m_1 \hat{y}_m(k+2|k) - a^m_2 \hat{y}_m(k+1|k) - a^m_3 y_m(k) - \cdots - a^m_{n_A} y_m(k-n_A+3) + d_m(k), \quad (28)$$
$$\vdots$$

For the considered multivariable process with $n_u$ inputs and $n_y$ outputs it is convenient to use the vector notation ($u(k), \Delta u(k) \in \mathbb{R}^{n_u}$, $y(k) \in \mathbb{R}^{n_y}$). Predictions can be expressed as functions of future control increments (the relation between future control increments and predictions is emphasized, the influence of the past is not shown)

$$\hat{y}(k+1|k) = S_1(k) \Delta u(k|k) + \cdots,$$
$$\hat{y}(k+2|k) = S_2(k) \Delta u(k|k) + S_1(k) \Delta u(k+1|k) + \cdots,$$
$$\hat{y}(k+3|k) = S_3(k) \Delta u(k|k) + S_2(k) \Delta u(k+1|k) + S_1(k) \Delta u(k+2|k) + \cdots, \quad (29)$$
$$\vdots$$

where the step-response submatrices are

$$S_j(k) = \begin{bmatrix} s^{1,1}_j(k) & \ldots & s^{1,n_u}_j(k) \\ \vdots & \ddots & \vdots \\ s^{n_y,1}_j(k) & \ldots & s^{n_y,n_u}_j(k) \end{bmatrix} \quad (30)$$

for $j = 1, \ldots, N - N_u + 1$. Step-response coefficients of the linearized model are determined recurrently for $j = 1, \ldots, N$ from

$$s^{m,n}_j(k) = \sum_{i=1}^{\min(j, n_B)} \tilde{b}^{m,n}_i(k) - \sum_{i=1}^{\min(j-1, n_A)} a^m_i s^{m,n}_{j-i}(k). \quad (31)$$

Output predictions (29) can be expressed in the following compact form

$$\hat{y}(k) = G(k) \Delta u(k) + y^0(k) \quad (32)$$

where

$$\hat{y}(k) = \begin{bmatrix} \hat{y}(k+1|k) \\ \vdots \\ \hat{y}(k+N|k) \end{bmatrix}, \qquad y^0(k) = \begin{bmatrix} y^0(k+1|k) \\ \vdots \\ y^0(k+N|k) \end{bmatrix} \quad (33)$$

are vectors of length $n_y N$. The output prediction is expressed as the sum of a forced trajectory ($G(k) \Delta u(k)$), which depends only on the future (on future control moves), and a free trajectory $y^0(k)$, which depends only on the past. The dynamic matrix $G(k)$ of dimensionality $n_y N \times n_u N_u$ is calculated on-line taking into account the current state of the plant

$$G(k) = \begin{bmatrix} S_1(k) & 0 & \ldots & 0 \\ S_2(k) & S_1(k) & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ S_N(k) & S_{N-1}(k) & \ldots & S_{N-N_u+1}(k) \end{bmatrix}. \quad (34)$$

It contains step-response coefficients of the local linear approximation of the nonlinear Hammerstein model. (Linearization and the free trajectory calculation are discussed in the following subsection.)

It is important to emphasize that for prediction in (32) a linearized Hammerstein model is used. In consequence, such a prediction (named in the MPC literature "the suboptimal prediction" [32, 46]) may be different from the real nonlinear prediction in which the nonlinear model is used without any simplifications. On the
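The recursion (31) and the assembly of the dynamic matrix (34) can be sketched for a single input-output channel (so every $S_j(k)$ is a scalar); the function names are illustrative:

```python
import numpy as np

def step_responses(b_tilde, a, N):
    """Step-response coefficients s_1, ..., s_N from the recursion (31).

    b_tilde: (nB,) linearized-model coefficients of one channel
    a:       (nA,) denominator coefficients a_i^m
    """
    nB, nA = len(b_tilde), len(a)
    s = np.zeros(N)
    for j in range(1, N + 1):
        s[j - 1] = b_tilde[:min(j, nB)].sum()
        for i in range(1, min(j - 1, nA) + 1):
            s[j - 1] -= a[i - 1] * s[j - i - 1]
    return s

def dynamic_matrix(s, N, Nu):
    """Dynamic matrix G(k), eq. (34), for one scalar channel."""
    G = np.zeros((N, Nu))
    for row in range(N):
        for col in range(min(row + 1, Nu)):
            G[row, col] = s[row - col]
    return G
```

For a channel with $\tilde b_1 = 1$, $\tilde b_2 = 0.5$ and $a_1 = -0.5$ the recursion yields $s = (1, 2, 2.5, \ldots)$, which fills the lower-triangular band of $G(k)$.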
other hand, thanks to using the suboptimal prediction equation (32), the general MPC optimization problem (3) becomes the following quadratic programming task

$$\min_{\Delta u(k),\, \varepsilon^{min},\, \varepsilon^{max}} \big\{ \| y^{ref}(k) - y^0(k) - G(k) \Delta u(k) \|^2_M + \| \Delta u(k) \|^2_\Lambda + \rho^{min} \| \varepsilon^{min} \|^2 + \rho^{max} \| \varepsilon^{max} \|^2 \big\}$$
subject to
$$u^{min} \le J \Delta u(k) + u(k-1) \le u^{max},$$
$$-\Delta u^{max} \le \Delta u(k) \le \Delta u^{max}, \quad (35)$$
$$y^{min} - \varepsilon^{min} \le y^0(k) + G(k) \Delta u(k) \le y^{max} + \varepsilon^{max},$$
$$\varepsilon^{min} \ge 0, \quad \varepsilon^{max} \ge 0$$

where

$$y^{ref}(k) = \begin{bmatrix} y^{ref}(k+1|k) \\ \vdots \\ y^{ref}(k+N|k) \end{bmatrix}, \quad (36) \qquad y^{min} = \begin{bmatrix} y^{min} \\ \vdots \\ y^{min} \end{bmatrix}, \quad y^{max} = \begin{bmatrix} y^{max} \\ \vdots \\ y^{max} \end{bmatrix} \quad (37)$$

are vectors of length $n_y N$,

$$u^{min} = \begin{bmatrix} u^{min} \\ \vdots \\ u^{min} \end{bmatrix}, \quad u^{max} = \begin{bmatrix} u^{max} \\ \vdots \\ u^{max} \end{bmatrix}, \quad \Delta u^{max} = \begin{bmatrix} \Delta u^{max} \\ \vdots \\ \Delta u^{max} \end{bmatrix}, \quad u(k-1) = \begin{bmatrix} u(k-1) \\ \vdots \\ u(k-1) \end{bmatrix} \quad (38)$$

are vectors of length $n_u N_u$, $M = \mathrm{diag}(M_1, \ldots, M_N)$ and $\Lambda = \mathrm{diag}(\Lambda_0, \ldots, \Lambda_{N_u-1})$ are weighting matrices of dimensionality $n_y N \times n_y N$ and $n_u N_u \times n_u N_u$, respectively, and

$$J = \begin{bmatrix} I_{n_u \times n_u} & 0_{n_u \times n_u} & \ldots & 0_{n_u \times n_u} \\ I_{n_u \times n_u} & I_{n_u \times n_u} & \ldots & 0_{n_u \times n_u} \\ \vdots & \vdots & \ddots & \vdots \\ I_{n_u \times n_u} & I_{n_u \times n_u} & \ldots & I_{n_u \times n_u} \end{bmatrix} \quad (39)$$

is the matrix of dimensionality $n_u N_u \times n_u N_u$ comprised of identity and zero submatrices of dimensionality $n_u \times n_u$.

If hard output constraints are taken into account, the MPC optimization task may be affected by the infeasibility problem: the admissible set of the optimization problem may be empty. To cope with such a situation, output constraints have to be softened by means of slack variables [34, 46]. A quadratic penalty for constraint violations is used in the MPC-NPL optimization problem (35); $\varepsilon^{min}$ and $\varepsilon^{max}$ are vectors of length $n_y N$ comprising the slack variables and $\rho^{min}, \rho^{max} > 0$ are weights. The bigger these weights, the bigger the penalty term $\rho^{min} \| \varepsilon^{min} \|^2 + \rho^{max} \| \varepsilon^{max} \|^2$ and, in consequence, the smaller the degree of output constraint violation. On the other hand, very big weights are not recommended, as they may lead to numerical problems.

Fig. 5 The structure of the MPC algorithm with nonlinear prediction and linearization

As shown in Fig. 5, at each sampling instant $k$ the following steps are repeated in the MPC-NPL algorithm:

1. Linearization of the neural Hammerstein model: obtain the coefficients $\tilde{b}^{m,n}_l(k)$, $m = 1, \ldots, n_y$, $n = 1, \ldots, n_u$, $l = 1, \ldots, n_B$, which comprise the step-response submatrices $S_j(k)$ (30) and the dynamic matrix $G(k)$ (34).
2. Find the nonlinear free trajectory $y^0(k)$ using the neural Hammerstein model.
3. Solve the quadratic programming problem (35) to determine the optimal control policy $\Delta u(k)$.
4. Apply to the process the first $n_u$ elements of the calculated vector, i.e. $u(k) = \Delta u(k|k) + u(k-1)$.
5. Increase the iteration counter of the algorithm, i.e. set $k := k+1$, and go to step 1.

4.2 Linearization, the free trajectory calculation

Coefficients of the linearized model (22), (25) are calculated from the Hammerstein model (18) or (19) as

$$\tilde{b}^{m,n}_l(k) = \frac{\partial y_m(k)}{\partial u_n(k-l)} \quad (40)$$

for all $m = 1, \ldots, n_y$, $n = 1, \ldots, n_u$, $l = 1, \ldots, n_B$. Derivatives of the model outputs $y_m(k)$ are calculated analytically; the structure of the model is exploited. Let the vectors $\bar{x}^m(k) \in \mathbb{R}^{n_A + n_u n_B}$ ($m = 1, \ldots, n_y$) denote linearization points. They are comprised of the past input and output signal values which are arguments of the Hammerstein model of the $m$th output (17)
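When none of the constraints in (35) are active, minimizing the first two terms of the cost gives the closed-form solution $\Delta u(k) = (G^T M G + \Lambda)^{-1} G^T M (y^{ref}(k) - y^0(k))$. A sketch of that unconstrained case (a full implementation would hand the constraints and slack variables to a QP solver instead):

```python
import numpy as np

def npl_unconstrained(G, y_ref, y0, M, Lam):
    """Unconstrained minimizer of the MPC-NPL cost in (35).

    G:     (ny*N, nu*Nu) dynamic matrix
    y_ref: (ny*N,) reference trajectory, y0: (ny*N,) free trajectory
    M:     (ny*N, ny*N) and Lam: (nu*Nu, nu*Nu) weighting matrices
    """
    H = G.T @ M @ G + Lam                    # positive definite since Lam > 0
    return np.linalg.solve(H, G.T @ M @ (y_ref - y0))
```

With $G = M = \Lambda = I$ the formula halves the tracking error in one move, a convenient sanity check.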
$$\bar{x}^m(k) = [\bar{u}_1(k-1) \ldots \bar{u}_1(k-n_B) \ldots \bar{u}_{n_u}(k-1) \ldots \bar{u}_{n_u}(k-n_B) \; \bar{y}_m(k-1) \ldots \bar{y}_m(k-n_A)]^T. \quad (41)$$

In type I of the model, using (18) and (40), one has

$$\tilde{b}^{m,n}_l(k) = \sum_{r=1}^{n_x} b^{m,r}_l \sum_{i=1}^{K} w^2_{r,i} \frac{d\varphi(z_i(\bar{x}^m(k)))}{dz_i(\bar{x}^m(k))} w^1_{i,n}. \quad (42)$$

If the hyperbolic tangent is used as the nonlinear transfer function $\varphi$ in the hidden layer of the steady-state part of the model,

$$\frac{d\varphi(z_i(\bar{x}^m(k)))}{dz_i(\bar{x}^m(k))} = 1 - \tanh^2(z_i(\bar{x}^m(k))). \quad (43)$$

In type II of the model, using (19) and (40), one has

$$\tilde{b}^{m,n}_l(k) = \sum_{r=1}^{n_x} b^{m,r}_l \sum_{i=1}^{K^r} w^{2,r}_i \frac{d\varphi(z^r_i(\bar{x}^m(k)))}{dz^r_i(\bar{x}^m(k))} w^{1,r}_{i,n}. \quad (44)$$

The nonlinear free trajectory $y^0_m(k+p|k)$ over the prediction horizon, $p = 1, \ldots, N$, $m = 1, \ldots, n_y$, is calculated on-line recursively from the general prediction equation (20) using the neural Hammerstein model (18) or (19). For prediction the Hammerstein model is used recurrently. In type I of the model, one obtains predictions over the prediction horizon ($p = 1, \ldots, N$)

$$\hat{y}_m(k+p|k) = \sum_{r=1}^{n_x} \sum_{l=1}^{I_{uf}(p)} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-l+p|k) \Big) \Big]$$
$$+ \sum_{r=1}^{n_x} \sum_{l=I_{uf}(p)+1}^{n_B} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-l+p) \Big) \Big]$$
$$- \sum_{l=1}^{I_{yp}(p)} a^m_l \hat{y}_m(k-l+p|k) - \sum_{l=I_{yp}(p)+1}^{n_A} a^m_l y_m(k-l+p) + d_m(k) \quad (45)$$

where $I_{uf}(p) = \min(p, n_B)$ and $I_{yp}(p) = \min(p-1, n_A)$. The first square brackets in (45) contain future values of control signals, which are actually calculated in the MPC-NPL algorithm. The second square brackets contain control signals applied to the process at previous sampling instants. Predictions also depend on previous predictions in a recurrent way (the second last sum) and on output signals measured at previous iterations (the last sum).

The free trajectory is derived from (45) taking into account the fact that it depends only on the past: no changes in the control signal from the sampling instant $k$ onwards are assumed, i.e. $u_j(k+p|k)$ should be replaced by $u_j(k-1)$ for $p \ge 0$. Moreover, future predictions $\hat{y}_m(k+p|k)$ are replaced by values of the free trajectory $y^0_m(k+p|k)$ for $p \ge 1$. One has

$$y^0_m(k+p|k) = \sum_{r=1}^{n_x} \sum_{l=1}^{I_{uf}(p)} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-1) \Big) \Big]$$
$$+ \sum_{r=1}^{n_x} \sum_{l=I_{uf}(p)+1}^{n_B} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-l+p) \Big) \Big]$$
$$- \sum_{l=1}^{I_{yp}(p)} a^m_l y^0_m(k-l+p|k) - \sum_{l=I_{yp}(p)+1}^{n_A} a^m_l y_m(k-l+p) + d_m(k). \quad (46)$$

In type II of the model, the free trajectory is

$$y^0_m(k+p|k) = \sum_{r=1}^{n_x} \sum_{l=1}^{I_{uf}(p)} b^{m,r}_l \Big[ w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi\Big( w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k-1) \Big) \Big]$$
$$+ \sum_{r=1}^{n_x} \sum_{l=I_{uf}(p)+1}^{n_B} b^{m,r}_l \Big[ w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi\Big( w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k-l+p) \Big) \Big]$$
$$- \sum_{l=1}^{I_{yp}(p)} a^m_l y^0_m(k-l+p|k) - \sum_{l=I_{yp}(p)+1}^{n_A} a^m_l y_m(k-l+p) + d_m(k). \quad (47)$$
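A sketch of the linearization formulas (42)-(43) for the type I model, using the same column-0-holds-biases weight layout as before (the 0-based input index `n` is my convention; the paper indexes inputs from 1):

```python
import numpy as np

def linearized_coeff(b_l, W1, W2, u_bar, n):
    """Coefficient b~_l^{m,n}(k) of the linearized type I model, eqs. (42)-(43).

    b_l:   (nx,) coefficients b_l^{m,r} of the dynamic part for delay l
    W1:    (K, nu + 1) hidden-layer weights, W2: (nx, K + 1) output weights
    u_bar: (nu,) input values at the linearization point x_bar^m(k)
    n:     0-based index of the input u_n differentiated with respect to
    """
    z = W1[:, 0] + W1[:, 1:] @ u_bar
    dphi = 1.0 - np.tanh(z) ** 2                 # tanh derivative, eq. (43)
    dx_du = W2[:, 1:] @ (dphi * W1[:, n + 1])    # d x_r / d u_n for each r
    return float(b_l @ dx_du)                    # weighted by b_l^{m,r}, eq. (42)
```

Because the derivative is analytic, it can be cross-checked against a finite difference of the steady-state part, which is a useful unit test for an implementation.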
Using (18) and (21), in type I of the model, the estimate of the unmeasured disturbance affecting the $m$th output of the process is

$$d_m(k) = y_m(k) - \sum_{r=1}^{n_x} \sum_{l=1}^{n_B} b^{m,r}_l \Big[ w^2_{r,0} + \sum_{i=1}^{K} w^2_{r,i} \varphi\Big( w^1_{i,0} + \sum_{j=1}^{n_u} w^1_{i,j} u_j(k-l) \Big) \Big] + \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (48)$$

Using (19) and (21), in type II of the model, the estimate of the unmeasured disturbance is

$$d_m(k) = y_m(k) - \sum_{r=1}^{n_x} \sum_{l=1}^{n_B} b^{m,r}_l \Big[ w^{2,r}_0 + \sum_{i=1}^{K^r} w^{2,r}_i \varphi\Big( w^{1,r}_{i,0} + \sum_{j=1}^{n_u} w^{1,r}_{i,j} u_j(k-l) \Big) \Big] + \sum_{l=1}^{n_A} a^m_l y_m(k-l). \quad (49)$$

5 Simulation results

5.1 Neutralization reactor

The process under consideration is a multivariable pH neutralization process with two manipulated and two controlled variables. The schematic diagram of the process is depicted in Fig. 6. In the tank, acid (HNO3), base (NaOH) and buffer (NaHCO3) are continuously mixed. The chemical reactions occurring in the system are

$$\mathrm{H_2CO_3} \leftrightarrow \mathrm{HCO_3^-} + \mathrm{H^+}, \quad (50)$$
$$\mathrm{HCO_3^-} \leftrightarrow \mathrm{CO_3^{2-}} + \mathrm{H^+}, \quad (51)$$
$$\mathrm{H_2O} \leftrightarrow \mathrm{OH^-} + \mathrm{H^+}. \quad (52)$$

Fig. 6 The pH reactor

The objective of the control system is to control the liquid level $h$ in the tank and the value of the pH of the outlet stream $q_4$ by manipulating the acid and base flow rates $q_1$ and $q_3$, respectively. It means that from the perspective of supervisory MPC algorithms the plant has two manipulated variables ($q_1$, $q_3$) and two controlled variables ($h$, pH). The fundamental dynamic model of the process is obtained from component balances and equilibrium relationships under the assumption of perfect mixing. The model consists of three nonlinear ordinary differential equations [33, 35]

$$\frac{dW_{a4}(t)}{dt} = \frac{q_1(t)(W_{a1} - W_{a4}(t))}{A h(t)} + \frac{q_2(t)(W_{a2} - W_{a4}(t))}{A h(t)} + \frac{q_3(t)(W_{a3} - W_{a4}(t))}{A h(t)}, \quad (53)$$

$$\frac{dW_{b4}(t)}{dt} = \frac{q_1(t)(W_{b1} - W_{b4}(t))}{A h(t)} + \frac{q_2(t)(W_{b2} - W_{b4}(t))}{A h(t)} + \frac{q_3(t)(W_{b3} - W_{b4}(t))}{A h(t)}, \quad (54)$$

$$\frac{dh(t)}{dt} = \frac{q_1(t) + q_2(t) + q_3(t) - C_V \sqrt{h(t)}}{A} \quad (55)$$

and an algebraic equation for the pH

$$W_{a4}(t) + 10^{\mathrm{pH}(t)-14} - 10^{-\mathrm{pH}(t)} + W_{b4}(t) \frac{1 + 2 \times 10^{\mathrm{pH}(t)-pK_2}}{1 + 10^{pK_1-\mathrm{pH}(t)} + 10^{\mathrm{pH}(t)-pK_2}} = 0 \quad (56)$$

where $pK_1 = -\log_{10} K_{a1}$, $pK_2 = -\log_{10} K_{a2}$ and

$$K_{a1} = \frac{[\mathrm{HCO_3^-}][\mathrm{H^+}]}{[\mathrm{H_2CO_3}]}, \quad (57)$$

$$K_{a2} = \frac{[\mathrm{CO_3^{2-}}][\mathrm{H^+}]}{[\mathrm{HCO_3^-}]} \quad (58)$$

are equilibrium constants of the reactions. The pH value is the maximum real solution of the nonlinear equation (56). The quantities $W_{ai}$ and $W_{bi}$ denote reaction invariants. Values of model parameters and nominal operating conditions ($q_1$, $q_2$, $q_3$, $h$, pH) are given in Table 1.

The pH neutralization process exhibits severe nonlinear properties. Steady-state characteristics of the reactor, $h(q_1, q_3)$ and $\mathrm{pH}(q_1, q_3)$, are depicted in Fig. 7. In particular, the relation between the manipulated variables and pH is nonlinear. Moreover, dynamic properties of the process are also nonlinear. Figure 8 shows step-responses of the reactor
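The fundamental model (53)-(56) is straightforward to simulate; a sketch using SciPy, with the parameters taken from Table 1 (the function names, tolerances and the approximate initial reaction invariants are my own choices; `solve_ivp`'s default `'RK45'` method plays the role of the Runge-Kutta 45 integration used in the paper's simulations):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Parameters and nominal flow rates from Table 1
Wa = np.array([0.003, -0.03, -0.00305])   # Wa1, Wa2, Wa3 [M]
Wb = np.array([0.0, 0.03, 0.00005])       # Wb1, Wb2, Wb3 [M]
A, CV = 207.0, 8.75
pK1, pK2 = -np.log10(4.47e-7), -np.log10(5.62e-11)

def rhs(t, state, q):
    """Right-hand side of (53)-(55); q = (q1, q2, q3) in ml/s."""
    Wa4, Wb4, h = state
    q = np.asarray(q)
    return [np.sum(q * (Wa - Wa4)) / (A * h),
            np.sum(q * (Wb - Wb4)) / (A * h),
            (q.sum() - CV * np.sqrt(h)) / A]

def pH_of(Wa4, Wb4):
    """Solve the algebraic pH equation (56) by bracketing on [0, 14]."""
    def f(pH):
        frac = (1 + 2 * 10 ** (pH - pK2)) / (1 + 10 ** (pK1 - pH) + 10 ** (pH - pK2))
        return Wa4 + 10 ** (pH - 14) - 10 ** (-pH) + Wb4 * frac
    return brentq(f, 0.0, 14.0)

# simulate 1000 s at the nominal flows, starting near the nominal operating point
q_nom = (16.6, 0.55, 15.6)
sol = solve_ivp(rhs, (0.0, 1000.0), [-4.36e-4, 5.28e-4, 14.0], args=(q_nom,),
                rtol=1e-8, atol=1e-10)
```

At the nominal flows the simulated state settles near $h \approx 14.01$ cm and pH $\approx 7.03$, consistent with Table 1.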
(the value of pH) caused by increasing and decreasing the acid flow rate $q_1$ by 1 ml/s and 5 ml/s. The sampling period is 10 s. For positive and negative changes in the flow rate $q_1$ the time-constants of the obtained step-responses are different (this is

Table 1 Parameters of the model and nominal operating conditions

Wa1 = 0.003 M          A = 207 cm²
Wa2 = −0.03 M          CV = 8.75 ml/cm s
Wa3 = −0.00305 M       q1 = 16.6 ml/s
Wb1 = 0 M              q2 = 0.55 ml/s
Wb2 = 0.03 M           q3 = 15.6 ml/s
Wb3 = 0.00005 M        h = 14.0090 cm
Ka1 = 4.47 × 10⁻⁷      pH = 7.0255
Ka2 = 5.62 × 10⁻¹¹

not true for linear systems). Furthermore, unlike in linear systems, the speed of the reaction depends on the value of the manipulated variable.

5.2 Neutralization reactor modeling for control

Three types of models are considered:

(a) The linear model with constant parameters

$$y_1(k) = b^{1,1}_1 u_1(k-1) + b^{1,1}_2 u_1(k-2) + b^{1,2}_1 u_2(k-1) + b^{1,2}_2 u_2(k-2) - a^1_1 y_1(k-1) - a^1_2 y_1(k-2), \quad (59)$$

$$y_2(k) = b^{2,1}_1 u_1(k-1) + b^{2,1}_2 u_1(k-2) + b^{2,2}_1 u_2(k-1) + b^{2,2}_2 u_2(k-2) - a^2_1 y_2(k-1) - a^2_2 y_2(k-2). \quad (60)$$
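Identification of the linear model (59)-(60) reduces to ordinary least squares; a sketch for one output (the regressor layout follows (59); the function name and the synthetic data in the check are illustrative):

```python
import numpy as np

def identify_linear(u, y):
    """Least-squares fit of one output of the linear model (59)-(60):
    y(k) = b1*u1(k-1) + b2*u1(k-2) + b3*u2(k-1) + b4*u2(k-2)
           - a1*y(k-1) - a2*y(k-2).

    u: (S, 2) input samples, y: (S,) output samples
    Returns the parameter vector [b1, b2, b3, b4, a1, a2].
    """
    Phi = np.column_stack([u[1:-1, 0], u[:-2, 0],      # u1(k-1), u1(k-2)
                           u[1:-1, 1], u[:-2, 1],      # u2(k-1), u2(k-2)
                           -y[1:-1], -y[:-2]])         # -y(k-1), -y(k-2)
    theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
    return theta
```

On noise-free data generated by a known stable model the fit recovers the parameters exactly (up to numerical precision), which is a convenient correctness check before using real process data.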
Fig. 7 Steady-state characteristics h(q1 , q3 ), pH(q1 , q3 ) of the process
Fig. 8 Step-responses of the reactor (the value of pH) caused by increasing (solid line) and decreasing (dashed line) the acid flow rate q1 by 1 ml/s (left) and 5 ml/s (right) at k = 0
(b) The neural Hammerstein model which can be expressed as general functions (17) y1 (k) = f1 (u1 (k − 1), u1 (k − 2), u2 (k − 1), u2 (k − 2), y1 (k − 1), y1 (k − 2)),
(61)
y2 (k) = f2 (u1 (k − 1), u1 (k − 2), u2 (k − 1), u2 (k − 2), y2 (k − 1), y2 (k − 2)).
(62)
(c) The polynomial Hammerstein model with a second-order dynamic part (it can also be expressed as the functions (61) and (62)); as the steady-state part of the model two polynomials are used

\[ x_1(k) = \sum_{n_1=0}^{o_1} \sum_{n_2=0}^{o_2} \alpha_{1,n_1,n_2}\, u_1^{n_1}(k)\, u_2^{n_2}(k), \tag{63} \]

\[ x_2(k) = \sum_{n_1=0}^{o_1} \sum_{n_2=0}^{o_2} \alpha_{2,n_1,n_2}\, u_1^{n_1}(k)\, u_2^{n_2}(k) \tag{64} \]

where the integers o1 and o2 determine the order of the polynomials and α1,n1,n2, α2,n1,n2 are coefficients.

All three empirical models have the same input arguments, determined by n_A^m = n_B^{m,n} = 2, m = 1, 2, n = 1, 2. (Second-order dynamics is used because first-order dynamics turns out to be insufficiently accurate while third-order dynamics does not improve accuracy significantly.) Because the input and output process variables have different orders of magnitude, they are scaled as

\[ u_1 = \frac{1}{15}(q_1 - q_{1,\mathrm{nom}}), \qquad y_1 = \frac{1}{35}(h - h_{\mathrm{nom}}), \tag{65} \]

\[ u_2 = \frac{1}{15}(q_3 - q_{3,\mathrm{nom}}), \qquad y_2 = \frac{1}{4}(\mathrm{pH} - \mathrm{pH}_{\mathrm{nom}}) \tag{66} \]
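The Hammerstein structure behind (61)–(62) — a static neural part in series with a second-order linear dynamic part per output — can be sketched as follows. All weights below are illustrative placeholders, not the identified values from the paper:

```python
import math

def neural_hammerstein_sim(u_seq, W1, b1, W2, b2, bd, ad):
    """Simulate a two-input, two-output neural Hammerstein model:
    a static one-hidden-layer tanh network maps u(k) to auxiliary
    signals x(k), which drive a second-order linear dynamic part
        y_m(k) = bd[m][0]*x_m(k-1) + bd[m][1]*x_m(k-2)
                 - ad[m][0]*y_m(k-1) - ad[m][1]*y_m(k-2)."""
    x_hist = [[0.0, 0.0], [0.0, 0.0]]   # x_m(k-1), x_m(k-2)
    y_hist = [[0.0, 0.0], [0.0, 0.0]]   # y_m(k-1), y_m(k-2)
    y_seq = []
    for u in u_seq:
        # static nonlinear part (hidden tanh layer, linear output layer)
        hidden = [math.tanh(sum(w * ui for w, ui in zip(row, u)) + bb)
                  for row, bb in zip(W1, b1)]
        x = [sum(w * h for w, h in zip(row, hidden)) + bb
             for row, bb in zip(W2, b2)]
        # linear dynamic part, second order for each output
        y = [bd[m][0] * x_hist[m][0] + bd[m][1] * x_hist[m][1]
             - ad[m][0] * y_hist[m][0] - ad[m][1] * y_hist[m][1]
             for m in range(2)]
        for m in range(2):
            x_hist[m] = [x[m], x_hist[m][0]]
            y_hist[m] = [y[m], y_hist[m][0]]
        y_seq.append(y)
    return y_seq

# Illustrative weights: K = 3 hidden nodes, stable dynamic parts
W1 = [[0.5, -0.3], [0.2, 0.8], [-0.4, 0.1]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, 0.5, -0.2], [0.3, -0.7, 0.6]]
b2 = [0.0, 0.0]
bd = [[0.6, 0.2], [0.5, 0.1]]
ad = [[-0.9, 0.2], [-0.8, 0.15]]
y_seq = neural_hammerstein_sim([[0.1, -0.1]] * 60, W1, b1, W2, b2, bd, ad)
```

For comparison, each polynomial steady-state part (63)–(64) of order o1 = o2 = 3 already has (3+1)(3+1) = 16 coefficients; together with the 12 parameters of the linear dynamic part this gives 2·16 + 12 = 44 parameters, consistent with Table 2.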
where the quantities q1,nom, q3,nom, hnom, pHnom correspond to the nominal operating point given in Table 1. During model training the following Mean Squared Error (MSE) performance function is minimized

\[ \mathrm{MSE} = \frac{1}{S} \sum_{k=1}^{S} \sum_{m=1}^{2} \bigl( y_m(k|k-1) - y_m(k) \bigr)^2 \tag{67} \]
where y_m(k|k−1) (m = 1, . . . , n_y, n_y = 2) denote outputs of the model for the sampling instant k calculated using signals up to the sampling instant k − 1, y_m(k) are targets, i.e. recorded real values of the process output variables, and S is the number of samples.

The fundamental model of the neutralization reactor (53)–(56) is used as the real process. The system of differential equations (53)–(55) is solved using the Runge-Kutta RK45 method. The fundamental model is simulated open-loop in order to obtain three sets of data, namely the training, validation and test data sets depicted in Fig. 9. The training and test data sets contain 3000 samples each, the validation data set contains 1000 samples. The sampling time is 10 s. Output signals contain small measurement noise.

Three data sets are used [42]. The training data set is used only for model identification (training in the context of neural models), i.e. the MSE performance function (67) is minimized on the training set. As a result, parameters of the model are optimized. The value of the MSE index for the validation set is monitored during training. Usually, both training and validation MSE indices decrease during the initial phase of training. When the validation error increases, training is terminated (to avoid overfitting). Such an approach makes it possible to obtain models which have good generalization properties. Model selection is performed taking into account the value of the MSE index only for the validation data set. Finally, the third set (the test data set) is used to assess the generalization of the chosen model. It is important that the test data set is used neither for training nor for model selection. Therefore, the error on the test set gives an unbiased estimate of the generalization error.

In the case of the linear model the least-squares method is used for identification, whereas in the case of Hammerstein models the MSE performance index (67) is minimized using a gradient method [25]. Gradients of this function are calculated analytically using the backpropagation scheme, whereas the optimization direction is found by the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [5], which is well known for its efficiency (fast convergence and robustness). The golden section method is used in the line search procedure.
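The early-stopping procedure described above can be sketched on a toy problem; a scalar model y = w·u and plain gradient descent stand in for the Hammerstein model and for the BFGS algorithm actually used in the paper:

```python
def train_with_early_stopping(train, valid, lr=0.05, max_epochs=200):
    """Minimize the training MSE by gradient descent while monitoring
    the validation MSE; stop and return the best weight once the
    validation error starts to increase (early stopping)."""
    def mse(data, w):
        return sum((w * u - y) ** 2 for u, y in data) / len(data)
    w = 0.0
    best_w, best_val = w, mse(valid, w)
    for _ in range(max_epochs):
        grad = sum(2 * (w * u - y) * u for u, y in train) / len(train)
        w -= lr * grad
        val = mse(valid, w)
        if val > best_val:        # validation error increased: terminate
            break
        best_w, best_val = w, val
    return best_w

# Noisy training data around y = 2u; clean validation data
train = [(0.1 * j, 2.0 * (0.1 * j) + 0.05 * (-1) ** j) for j in range(10)]
valid = [(u, 2.0 * u) for u in (0.05, 0.15, 0.25)]
w_hat = train_with_early_stopping(train, valid)
```

The returned weight is the one with the smallest validation error seen so far, which is what gives early stopping its regularizing effect.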
Because training of neural Hammerstein models is in fact an unconstrained nonlinear minimization problem with the MSE performance index as the objective function, the identification experiment is repeated 10 times for each model configuration, weights of neural networks are initialized randomly. The best obtained models are presented in the following part of the article. At first, the linear model (59), (60) is found. Unfortunately, because the process exhibits significantly nonlinear steady-state and dynamic behavior as shown in Figs. 7 and 8, the accuracy of the linear model is very low. Outputs of the process and outputs of the linear model for training and validation data sets are compared in Fig. 10. The linear model is unable to precisely predict behavior of the process. In particular, because the steady-state relation pH(q1 , q3 ) is significantly nonlinear as shown in Fig. 7, the linear dynamic model of pH is very inaccurate. All compared neural Hammerstein models (61), (62) and polynomial Hammerstein models have the same input arguments. Both types of neural Hammerstein models are considered, in both cases nx = 2. In type I only one neural network with K hidden nodes realizes the steady-state part of
Fig. 9 The training data set (top panels), the validation data set (middle panels) and the test data set (bottom panels)
the model (Figs. 1 and 3). In type II two neural networks are used (Figs. 2 and 4), with K1 and K2 hidden nodes in the first and the second network, respectively. The hyperbolic tangent transfer function is used in the hidden layers of the steady-state parts. As far as type I of the neural model is concerned, neural networks containing K = 1, . . . , 10 hidden nodes are
Fig. 10 Outputs of the process (solid line) vs. outputs of the linear model (dashed line) for the training data set (top panels) and for the validation data set (bottom panels)
considered. In the case of type II of the model, models containing two neural networks with K1 = K2 = 1, . . . , 7 hidden nodes are considered. In the steady-state part of the polynomial Hammerstein model polynomials of order o1 = o2 = 3, . . . , 7 are used. Properties of the compared models in terms of the number of parameters and the MSE performance index are given in Table 2.

Table 2 Properties of linear, neural Hammerstein and polynomial Hammerstein models in terms of the number of parameters and the MSE performance index; the value of the performance function of the chosen model (the validation data set) is given in bold

| Model | Number of parameters | MSE training | MSE validation | MSE test |
|---|---|---|---|---|
| Linear | 12 | 5.981 × 10⁻² | 6.819 × 10⁻² | – |
| Neural Hammerstein type I, K = 1 | 19 | 4.974 × 10⁻² | 5.046 × 10⁻² | – |
| Neural Hammerstein type I, K = 2 | 24 | 1.662 × 10⁻² | 1.696 × 10⁻² | – |
| Neural Hammerstein type I, K = 3 | 29 | 1.606 × 10⁻² | 1.980 × 10⁻² | – |
| Neural Hammerstein type I, K = 4 | 34 | 1.344 × 10⁻² | 1.688 × 10⁻² | – |
| Neural Hammerstein type I, K = 5 | 39 | 1.330 × 10⁻² | 1.710 × 10⁻² | – |
| Neural Hammerstein type I, K = 6 | 44 | 1.323 × 10⁻² | 1.664 × 10⁻² | – |
| Neural Hammerstein type I, K = 7 | 49 | 1.323 × 10⁻² | 1.591 × 10⁻² | – |
| Neural Hammerstein type I, K = 8 | 54 | 1.321 × 10⁻² | 1.617 × 10⁻² | – |
| Neural Hammerstein type I, K = 9 | 59 | 1.319 × 10⁻² | 1.652 × 10⁻² | – |
| Neural Hammerstein type I, K = 10 | 64 | 1.316 × 10⁻² | 1.671 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 1 | 22 | 1.724 × 10⁻² | 1.963 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 2 | 30 | 1.608 × 10⁻² | 1.915 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 3 | 38 | 1.350 × 10⁻² | **1.594 × 10⁻²** | 1.505 × 10⁻² |
| Neural Hammerstein type II, K1 = K2 = 4 | 46 | 1.363 × 10⁻² | 1.600 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 5 | 54 | 1.340 × 10⁻² | 1.671 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 6 | 62 | 1.336 × 10⁻² | 1.648 × 10⁻² | – |
| Neural Hammerstein type II, K1 = K2 = 7 | 70 | 1.335 × 10⁻² | 1.731 × 10⁻² | – |
| Polynomial Hammerstein, 3rd order | 44 | 1.724 × 10⁻² | 1.948 × 10⁻² | – |
| Polynomial Hammerstein, 4th order | 62 | 1.723 × 10⁻² | 1.832 × 10⁻² | – |
| Polynomial Hammerstein, 5th order | 84 | 1.729 × 10⁻² | 1.883 × 10⁻² | – |
| Polynomial Hammerstein, 6th order | 110 | 1.704 × 10⁻² | 1.873 × 10⁻² | – |
| Polynomial Hammerstein, 7th order | 140 | 1.686 × 10⁻² | 1.820 × 10⁻² | – |

Both neural and polynomial Hammerstein models are significantly more precise than the linear model. Unfortunately, polynomial Hammerstein structures have two disadvantages: lower accuracy and a very big number of parameters. Hence, when one wants to obtain an accurate model whose number of parameters is moderate, a neural Hammerstein model should be chosen. Considering neural Hammerstein models, it is evident that increasing the number of hidden nodes reduces the MSE performance index for the training data set. On the other hand, for model selection only the MSE index for the validation data set is taken into account. When Hammerstein models of type I are considered, the model with K = 7 hidden nodes should be chosen because it gives the smallest MSE index for the validation data set (MSE validation = 1.591 × 10⁻²); for more complicated models the MSE index slightly increases. When models of type II are considered, the model with K1 = K2 = 3 hidden nodes should be chosen (MSE validation = 1.594 × 10⁻²). When it comes to selecting the model which is next used in MPC algorithms, not only good accuracy but also complexity should be taken into account. Considering these two neural Hammerstein models (i.e. type I with K = 7 hidden nodes and type II with K1 = K2 = 3 nodes), the model of type II is finally chosen as a reasonable compromise between accuracy and complexity. For the validation data set the MSE index is slightly bigger for the model of type II, but the difference is small (approx. 0.19%). It is important that the chosen model has only 38 parameters (including parameters of the linear dynamic part), whereas the type I model with 7 hidden nodes has 49 parameters. In order to assess the generalization of the chosen neural Hammerstein model the MSE index for the test data set is calculated (MSE test = 1.505 × 10⁻²). Because this set is used neither for training nor for model selection, such a small value of the test MSE means that the model has good generalization properties. Outputs of the process and outputs of the chosen neural Hammerstein model for the training, validation and test data sets are depicted in Fig. 11.

5.3 Neutralization reactor control

During simulations of MPC algorithms two models of the process are used. The fundamental model (53)–(56) is used as the real process during simulations. Differential equations (53)–(55) are solved using the Runge-Kutta RK45 method. The neural Hammerstein model is used in MPC algorithms. In order to emphasize the accuracy and the computational efficiency of the discussed MPC-NPL algorithm based on the
Hammerstein model, in the following part of the article two MPC algorithms are compared:

(a) the suboptimal MPC-NPL algorithm based on the neural Hammerstein model (61), (62),
(b) the MPC algorithm with on-line Nonlinear Optimization (MPC-NO) based on the same model.

Both algorithms are implemented in Matlab. In the MPC-NPL algorithm the nonlinear model is linearized on-line and the future control policy is calculated by means of a quadratic programming procedure. In the MPC-NO algorithm the nonlinear model is used without any simplifications. In consequence, at each sampling instant a nonlinear optimization problem is solved. For this purpose the Sequential Quadratic Programming (SQP) nonlinear optimization algorithm [5] is used. As the initial point for optimization, the nu(Nu − 1) = 2Nu − 2 control values calculated at the previous sampling instant and not applied to the process are used. Parameters of MPC are: N = 10, Nu = 2, Mp = diag(1, 1), Λp = diag(1, 1). (The literature concerning tuning of MPC algorithms is very rich, e.g. [44]; this issue is not discussed here.)

Fig. 11 Outputs of the process (solid line) vs. outputs of the chosen neural Hammerstein model (dashed line) for the training data set (top panels), for the validation data set (middle panels) and for the test data set (bottom panels)

Manipulated variables are constrained

\[ q_1^{\min} \le q_1 \le q_1^{\max}, \qquad q_3^{\min} \le q_3 \le q_3^{\max} \tag{68} \]
where q1min = 10 ml/s, q1max = 30 ml/s, q3min = 10 ml/s, q3max = 30 ml/s. In both considered MPC algorithms two reference trajectories are used. The first trajectory is
Fig. 12 Simulation results (the reference trajectory 1): the MPC-NO algorithm (solid line) and the MPC-NPL algorithm (dashed line) with the same neural Hammerstein model
\[ \mathrm{pH}^{\mathrm{ref}}(k) = \begin{cases} \mathrm{pH}_{\mathrm{nom}} & \text{if } k < 3, \\ 8 & \text{if } 3 \le k \le 49, \\ 9 & \text{if } 50 \le k \le 99, \\ 10 & \text{if } 100 \le k \le 150, \end{cases} \tag{69} \]

while the second trajectory is

\[ \mathrm{pH}^{\mathrm{ref}}(k) = \begin{cases} \mathrm{pH}_{\mathrm{nom}} & \text{if } k < 3, \\ 6 & \text{if } 3 \le k \le 49, \\ 5 & \text{if } 50 \le k \le 99, \\ 4 & \text{if } 100 \le k \le 150. \end{cases} \tag{70} \]
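The piecewise reference trajectories (69) and (70) translate directly into a small helper (set-point values from the equations above, pH nominal value from Table 1):

```python
def ph_ref(k, traj=1, ph_nom=7.0255):
    """Reference trajectories (69) (traj=1) and (70) (traj=2) for pH."""
    levels = {1: (8.0, 9.0, 10.0), 2: (6.0, 5.0, 4.0)}[traj]
    if k < 3:
        return ph_nom
    if k <= 49:
        return levels[0]
    if k <= 99:
        return levels[1]
    return levels[2]
```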
It is assumed that the liquid level should be stabilized at the constant value hnom, i.e. href(k) = hnom for all k. Very similar (but slower) reference trajectories are used in [35]. Simulation results are depicted in Figs. 12 and 13. It is evident that the closed-loop performance obtained in the suboptimal MPC-NPL algorithm with quadratic programming is practically the same as in the computationally prohibitive MPC-NO approach, in which at each sampling instant a nonlinear optimization problem has to be solved on-line. At the same time, the difference in computational burden is big. Table 3 shows the computational complexity of the compared MPC-NPL and MPC-NO algorithms based on the same neural Hammerstein model in terms of millions of floating point operations (MFLOPS) for different control horizons (total results are given, the computational burden is summarized for both reference trajectories). The control horizon has a bigger impact on the computational burden than the prediction horizon, since 2Nu is the number of decision variables of the MPC optimization problem. For Nu = 1 the suboptimal MPC-NPL algorithm is 5.92 times computationally less demanding than the MPC-NO algorithm, whereas for Nu = 10 this factor increases to 37.7.

Fig. 13 Simulation results (the reference trajectory 2): the MPC-NO algorithm (solid line) and the MPC-NPL algorithm (dashed line) with the same neural Hammerstein model

Table 3 The computational complexity in terms of millions of floating point operations (MFLOPS) of MPC-NO and MPC-NPL algorithms for different control horizons

| Nu | MPC-NPL | MPC-NO |
|---|---|---|
| 1 | 2.69 | 15.92 |
| 2 | 3.50 | 38.76 |
| 3 | 4.65 | 73.54 |
| 5 | 8.40 | 204.68 |
| 10 | 29.56 | 1084.77 |

Unmeasured disturbances are unavoidable in practice. Figures 14 and 15 show simulation results in the presence of noisy output measurements (the noise level is 3%). As previously, closed-loop trajectories obtained in the MPC-NPL algorithm are practically the same as in the MPC-NO algorithm.

Two typical control performance indices are calculated after completing simulations: the sum of absolute values of differences between reference trajectories and actual values of the controlled variables over the whole simulation horizon

\[ J_1 = \sum_{k=1}^{150} \bigl( |h^{\mathrm{ref}}(k) - h(k)| + |\mathrm{pH}^{\mathrm{ref}}(k) - \mathrm{pH}(k)| \bigr) \tag{71} \]

and the sum of squared differences

\[ J_2 = \sum_{k=1}^{150} \bigl( (h^{\mathrm{ref}}(k) - h(k))^2 + (\mathrm{pH}^{\mathrm{ref}}(k) - \mathrm{pH}(k))^2 \bigr). \tag{72} \]

The indices J1 and J2 obtained in both described experiments are given in Tables 4 and 5 (total results are given, performance indices are summarized for both reference trajectories). For the MPC-NO and MPC-NPL algorithms the values of J1 and J2 are very similar, as both algorithms give comparable control accuracy.
Fig. 14 Simulation results in presence of unmeasured disturbances (the reference trajectory 1): the MPC-NO algorithm (solid line) and the MPC-NPL algorithm (dashed line) with the same neural Hammerstein model
Table 4 Performance indices J1 and J2 of MPC-NPL and MPC-NO algorithms

| Index | MPC-NPL | MPC-NO |
|---|---|---|
| J1 | 55.40 | 56.99 |
| J2 | 37.36 | 38.37 |
Table 5 Performance indices J1 and J2 of MPC-NPL and MPC-NO algorithms in presence of unmeasured disturbances

| Index | MPC-NPL | MPC-NO |
|---|---|---|
| J1 | 56.60 | 62.07 |
| J2 | 36.26 | 38.75 |
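The core of the MPC-NPL step — linearizing the Hammerstein model on-line and minimizing a quadratic cost under input constraints — can be illustrated on a drastically simplified scalar, one-step example. The actual algorithm solves a QP over the control horizon Nu for both manipulated variables; the static nonlinearity, the gain b1 and all numerical values below are illustrative assumptions:

```python
import math

def npl_step(u_prev, y_ref, y_free, static_fn, b1, lam, u_min, u_max, eps=1e-6):
    """One toy MPC-NPL step for a scalar system: the Hammerstein
    prediction is approximated as y(k+1) = y_free + g*du, where
    g = b1 * d(static_fn)/du is obtained by linearizing the static
    nonlinearity at the current input u_prev.  Minimizing
    (y_ref - y_free - g*du)^2 + lam*du^2 gives du in closed form;
    clipping to [u_min, u_max] stands in for the QP constraints."""
    # numerical linearization of the static part at the operating point
    g = b1 * (static_fn(u_prev + eps) - static_fn(u_prev - eps)) / (2 * eps)
    # unconstrained minimizer of the quadratic cost
    du = g * (y_ref - y_free) / (g * g + lam)
    # enforce the input constraints
    return min(max(u_prev + du, u_min), u_max)

# Illustrative static nonlinearity and one control step
static = lambda u: math.tanh(2.0 * u)
u_next = npl_step(u_prev=0.1, y_ref=0.5, y_free=0.2,
                  static_fn=static, b1=0.8, lam=0.1,
                  u_min=-1.0, u_max=1.0)
```

Because the cost is quadratic in du once the model is linearized, each control update is a cheap convex problem — this is exactly what removes the need for on-line nonlinear optimization in MPC-NPL.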
Fig. 15 Simulation results in presence of unmeasured disturbances (the reference trajectory 2): the MPC-NO algorithm (solid line) and the MPC-NPL algorithm (dashed line) with the same neural Hammerstein model

6 Conclusions

Among numerous types of models, Hammerstein structures can be efficiently used for modeling of various processes. Although in the simplest case polynomials can be used in the steady-state part of the model, they have many drawbacks, as discussed in the introduction and in [25]. In particular, for multivariable processes the number of coefficients is usually very big. The neural Hammerstein model is a sound alternative to the classical polynomial-based Hammerstein structure. Neural networks can precisely approximate steady-state properties of the process, they have a relatively small number of parameters and their structure is simple. For the discussed chemical reactor, all evaluated polynomial Hammerstein models have significantly lower accuracy than neural Hammerstein structures and a big number of parameters, as shown in Table 2.

Neural Hammerstein models can be efficiently used in the suboptimal MPC-NPL algorithm detailed in this paper. It has many advantages: reliability, computational efficiency and closed-loop accuracy. The model is linearized on-line; as a result, a numerically reliable quadratic programming procedure is used on-line and the necessity of repeating nonlinear optimization at each sampling instant is eliminated. Although suboptimal, it is shown that for the neutralization process the algorithm in practice gives control performance comparable to that obtained in MPC with nonlinear optimization repeated at each sampling instant. The MPC-NPL algorithm based on the neural Hammerstein model has also been successfully applied to a distillation column, as described in [31]. It is worth emphasizing that implementation of the MPC-NPL algorithm (on-line linearization and the free trajectory calculation) is relatively easy, as described in Sect. 4.2. Finally, in contrast to other algorithms (e.g. [13]), the discussed algorithm does not need the inverse steady-state model of the process.

References
1. Abonyi J, Babuška R, Ayala Botto M, Szeifert F, Nagy L (2000) Identification and control of nonlinear systems using fuzzy Hammerstein models. Ind Eng Chem Res 39:4302–4314
2. Al-Duwaish H, Karim MN, Chandrasekar V (1997) Hammerstein model identification by multilayer feedforward neural networks. Int J Syst Sci 28:49–54
3. Aoubi M (1998) Comparison between the dynamic multi-layered perceptron and generalised Hammerstein model for experimental identification of the loading process in diesel engines. Control Eng Pract 6:271–279
4. Åkesson BM, Toivonen HT (2006) A neural network model predictive controller. J Process Control 16:937–946
5. Bazaraa MS, Sherali HD, Shetty CM (1999) Nonlinear programming: theory and algorithms. Wiley, New York
6. Billings SA, Fakhouri SY (1982) Identification of systems containing linear dynamic and static nonlinear elements. Automatica 18:15–26
7. Chan KH, Bao J (2007) Model predictive control of Hammerstein systems with multivariable nonlinearities. Ind Eng Chem Res 46:168–180
8. Chang FH, Luus R (1971) A noniterative method of identification using Hammerstein model. IEEE Trans Autom Control 16:464–468
9. Chen J, Huang TC (2004) Applying neural networks to on-line updated PID controllers for nonlinear process control. J Process Control 14:211–230
10. Dempsey EJ, Westwick DT (2004) Identification of Hammerstein models with cubic spline nonlinearities. IEEE Trans Biomed Eng 51:237–245
11. Dougherty D, Cooper D (2003) A practical multiple model adaptive strategy for single-loop MPC. Control Eng Pract 11:141–159
12. Eskinat E, Johnson S, Luyben WL (1991) Use of Hammerstein models in identification of nonlinear systems. AIChE J 37:255–268
13. Fruzzetti KP, Palazoğlu A, McDonald KA (1997) Nonlinear model predictive control using Hammerstein models. J Process Control 7:31–41
14. Goethals I, Pelckmans K, Suykens JAK, De Moor B (2005) Identification of MIMO Hammerstein models using least squares support vector machines. Automatica 41:1263–1272
15. Greblicki W (1989) Non-parametric orthogonal series identification of Hammerstein systems. Int J Syst Sci 20:2355–2367
16. Gustafsson T, Waller K (1992) Nonlinear and adaptive control of pH. Ind Eng Chem Res 31:2681–2693
17. Harnischmacher G, Marquardt W (2007) Nonlinear model predictive control of multivariable processes using block-structured models. Control Eng Pract 15:1238–1256
18. Haykin S (1999) Neural networks—a comprehensive foundation. Prentice Hall, Englewood Cliffs
19. Henson MA (1998) Nonlinear model predictive control: current status and future directions. Comput Chem Eng 23:187–202
20. Henson MA, Seborg DE (1994) Adaptive nonlinear control of a pH neutralization process. IEEE Trans Control Syst Technol 2:169–182
21. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
22. Hu Q, Saha P, Rangaiah GP (2000) Experimental evaluation of an augmented IMC for nonlinear systems. Control Eng Pract 8:1167–1176
23. Hussain MA (1999) Review of the applications of neural networks in chemical process control—simulation and online implementation. Artif Intell Eng 13:55–68
24. Ingram A, Matthew A, Franchek A, Balakrishnan V, Surnilla G (2005) Robust SISO H∞ controller design for nonlinear systems. Control Eng Pract 13:1413–1423
25. Janczak A (2004) Identification of nonlinear systems using neural networks and polynomial models: a block-oriented approach. Springer, London
26. Knohl T, Xu WM, Unbehauen H (2003) Indirect adaptive dual control for Hammerstein systems using ANN. Control Eng Pract 11:377–385
27. Lakshminarayanan S, Shah SL, Nandakumar K (1995) Identification of Hammerstein models using multivariate statistical tools. Chem Eng Sci 50:3599–3613
28. Ling WM, Rivera D (1998) Nonlinear black-box identification of distillation column models—design variable selection for model performance enhancement. Int J Appl Math Comput Sci 8:793–813
29. Loh AP, De DS, Krishnaswamy PR (2004) pH and level controller for a pH neutralization process. Ind Eng Chem Res 40:3579–3584
30. Lu Y, Arkun Y, Palazoğlu A (2004) Real-time application of scheduling quasi-min-max model predictive control to a bench-scale neutralization reactor. Ind Eng Chem Res 43:2730–2735
31. Ławryńczuk M (2008) Suboptimal nonlinear predictive control with MIMO neural Hammerstein models. In: Nguyen NT, Borzemski L, Grzech A, Ali M (eds) New frontiers in applied artificial intelligence. Proceedings of the 21st international conference on industrial, engineering & other applications of applied intelligent systems, IEA-AIE 2008, Wrocław, Poland. Lecture notes in artificial intelligence, vol 5027, pp 225–234
32. Ławryńczuk M (2007) A family of model predictive control algorithms with artificial neural networks. Int J Appl Math Comput Sci 17:217–232
33. Ławryńczuk M, Marusak P, Tatjewski P (2007) Economic efficacy of multilayer constrained predictive control structures: an application to a MIMO neutralisation reactor. In: Proceedings of the 11th IFAC/IFORS/IMACS/IFIP symposium on large scale systems: theory and applications, Gdańsk, Poland, CD-ROM, paper no 93
34. Maciejowski JM (2002) Predictive control with constraints. Prentice Hall, Harlow
35. Marusak P (2009) Advantages of an easy to design fuzzy predictive algorithm in control systems of nonlinear chemical reactors. Appl Soft Comput 9:1111–1125
36. Narendra KS, Gallman PG (1966) An iterative method for the identification of nonlinear systems using the Hammerstein model. IEEE Trans Autom Control 12:546–550
37. Nešić D, Mareels IMY (1998) Dead-beat control of simple Hammerstein models. IEEE Trans Autom Control 43:1184–1188
38. Nørgaard M, Ravn O, Poulsen NK, Hansen LK (2000) Neural networks for modelling and control of dynamic systems. Springer, London
39. Patwardhan RS, Lakshminarayanan S, Shah SL (1998) Constrained nonlinear MPC using Hammerstein and Wiener models. AIChE J 44:1611–1622
40. Piche S, Sayyar-Rodsari B, Johnson D, Gerules M (2000) Nonlinear model predictive control using neural networks. IEEE Control Syst Mag 20:56–62
41. Qin SJ, Badgwell TA (2003) A survey of industrial model predictive control technology. Control Eng Pract 11:733–764
42. Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
43. Rossiter JA (2003) Model-based predictive control. CRC Press, Boca Raton
44. Scattolini R, Bittanti S (1990) On the choice of the horizon in long-range predictive control—some simple criteria. Automatica 26:915–917
45. Smith JG, Kamat S, Madhavan KP (2007) Modeling of pH process using wavenet based Hammerstein model. J Process Control 17:551–561
46. Tatjewski P (2007) Advanced control of industrial processes: structures and algorithms. Springer, London
47. Yu DL, Gomm JB (2003) Implementation of neural network predictive control to a multivariable chemical reactor. Control Eng Pract 11:1315–1323
48. Vörös J (1997) Parameter identification of discontinuous Hammerstein systems. Automatica 33:1141–1146