Proceedings of the 41st IEEE Conference on Decision and Control Las Vegas, Nevada USA, December 2002
WeA01-4
Adaptive switching supervisory control of nonlinear systems with no prior knowledge of noise bounds
David Angeli and Edoardo Mosca
Dipartimento di Sistemi e Informatica, Università di Firenze, Via di S. Marta 3, 50139 Firenze
Abstract. The problem of controlling nonlinear uncertain noisy systems is approached via the introduction of a supervisor which, whenever needed, switches on, in feedback to the plant, a controller selected from a finite set of predesigned controllers. A Lyapunov-based falsification criterion allows one to ensure robust stability in the presence of uncertain constant parameters and exogenous bounded disturbances.
1 Introduction
One of the well-established approaches for controlling time-varying uncertain plants is the introduction of adaptation in the feedback loop [2]. However, conventional continuous adaptation is not always capable of performing satisfactorily, mainly because of its inherent difficulty in taking advantage of prior knowledge of potential plant changes. This is particularly true whenever the plant switches among different modes of operation or in the absence of sufficient excitation. In both circumstances undesirable transients may typically arise due to slow adaptation. In recent years, adaptive switching supervisory control (SSC) has emerged as an alternative approach for tackling the problem [3, 4, 6, 9, 14], with the appealing inherent feature of resembling an adaptive version of classic gain-scheduling control, which has been successful in many applications. In fact, SSC aims at extending gain-scheduling control to cases where the supervisor has no full information on the current dynamical behaviour of the plant to be controlled. A typical situation is the one where only records of past plant I/O data are available to let the supervisor decide whether the current controller is adequate and, if not, select another candidate controller. If for all potential changes the plant belongs to a prespecified set of models, the key idea of SSC is to have a, usually finite, family of candidate controllers $C_i$, such that each plant model in the set performs "satisfactorily" when controlled by at least one of the $C_i$'s. Then, a suitably designed supervisory unit takes care of orchestrating the switching among the candidate controllers so as to preserve closed-loop stability and, possibly, performance in the face of plant changes.
0-7803-7516-5/02/$17.00 ©2002 IEEE
Switching mechanisms are usually based on a supervisory logic whereby a controller is falsified whenever the inferred behaviour of another controller turns out to be better than the one actually achieved by the currently operating controller. Whenever this happens, the candidate controller with the best inferred behaviour is switched on in feedback to the plant, replacing the currently operating controller. To the best of the authors' knowledge, available controller falsification approaches rely on the assumption that either disturbance bounds are given a priori [1, 14] or the disturbance colour is known [10]. If such prior information is imprecise, either the falsification rate becomes too high (underestimated bounds or wrong colour) or the acting controller is falsified only after the plant I/O variables have become too large with respect to nominal performance specifications (overestimated disturbance amplitude). The main contribution of this paper is to present falsification and inference criteria, integrated in a new supervisory switching logic, which require no prior information on disturbance bounds and are applicable to a wide class of linear and nonlinear plants.
2 Problem formulation
Consider a discrete-time nonlinear system of the following form:
$$x(t+1) = f(x(t), u(t), d(t), \theta), \qquad f(x, u, d, \theta) := \tilde{f}(x, d, \theta) + g(x, u) \tag{1}$$
with state $x \in \mathbb{R}^n$, control input $u \in \mathcal{U} \subset \mathbb{R}^p$, exogenous disturbance $d(t) \in \mathcal{D} \subset \mathbb{R}^q$ and unknown parameter vector $\theta$ belonging to a compact set $\Theta$. The aim is to jointly design a finite family of candidate controllers, in the form of state-feedback control laws, and a supervisory unit responsible for orchestrating the switching among the controllers in such a way that the resulting closed-loop system is input-to-state stable (ISS) [13]. The supervisory logic we are looking for has to be capable of handling the case of possibly time-varying parameters. The supervisory unit is devised so as to deal with possibly large uncertainties by adaptively selecting a suitable feedback gain among a finite family of pre-designed controllers. In this respect, the crucial assumption is the existence of a finite cover for $\Theta$,
$$\Theta \subset \bigcup_{i \in \mathcal{N}} \Theta_i, \tag{2}$$
$\mathcal{N} := \{1, 2, \ldots, N\}$, a family of Lyapunov functions $V_i$ and controllers $k_i$ with the property that for all $\theta \in \Theta_i$, $x \in \mathbb{R}^n$ and $d \in \mathcal{D}$
$$V_i(f(x, k_i(x), d, \theta)) - V_i(x) \le -\alpha_i(|x|) + \gamma_i(|d|) \tag{3}$$
where $|\cdot|$ denotes the Euclidean norm, for some $\mathcal{K}_\infty$ functions $\alpha_i$, $\gamma_i$. Moreover, each function $V_i(x)$ satisfies the bounds $\underline{\alpha}_i(|x|) \le V_i(x) \le \overline{\alpha}_i(|x|)$ for some $\underline{\alpha}_i$, $\overline{\alpha}_i$ of class $\mathcal{K}_\infty$. In other words, the first step of the adopted procedure is to design a bank of robustly input-to-state stabilizing controllers $k_i(x)$ with their associated ISS-Lyapunov functions [7]. Notice that, thanks to the special decoupled form of (1), it is possible, based on the knowledge of the current and past states, to compute
$$x(t|i) := f(x(t-1), k_i(x(t-1)), d(t-1), \theta) = x(t) - g(x(t-1), u(t-1)) + g(x(t-1), k_i(x(t-1))),$$
viz. the value the state would have taken at $t$ had the $i$-th controller been used in the loop at $t-1$. The switching algorithm we propose improves on the one in [1] in various directions. First, no bounds on the state disturbance are assumed to be known a priori (its magnitude is, in some sense, adaptively estimated on-line). Second, the "exhaustive spanning property", which was instrumental to the proof of stability in [1], need not be enforced. In fact, the switching logic in [1] is designed in such a way that the supervisor switches on in feedback to the plant all elements in the family of candidate controllers after a sufficiently large number of switches, regardless of plant I/O data. This constraint need not be enforced in the approach of this paper, as the adoption of a performance-based switching criterion provides convergence of switching in finite time.

[...] case of constant uncertain parameters. The switching logic operates by comparing a set of performance signals $\Delta_i(t)$ generated as follows:
$$\begin{aligned} \Delta_i(0) &= 0 \\ \delta_i(t) &= V_i(x(t|i)) - V_i(x(t-1)) + \alpha_i(|x(t-1)|) \\ \Delta_i(t) &= \max\{\Delta_i(t-1),\ \gamma_i^{-1}(\delta_i(t))\} \end{aligned} \tag{4}$$
[...]
$$i^*(t) := \arg\min_{i \in \mathcal{N}} \{\Delta_i(t) - h\,\delta_{i\,i^*(t-1)}\} \tag{5}$$
where $h > 0$ plays the role of an additive hysteresis constant, and $\delta_{ij}$ is Kronecker's $\delta$. Alternatively, we may adopt a scale-independent hysteresis [8] by setting:
$$i^*(t) := \arg\min_{i \in \mathcal{N}} \{\Delta_i(t)\,(1 - c\,\delta_{i\,i^*(t-1)})\}. \tag{6}$$

Lemma 3.1 Let the unknown parameter vector $\theta \in \Theta$ be constant. Then, there exists an index $i \in \mathcal{N}$ such that for all $t \in \mathbb{Z}_+$
$$\Delta_i(t) \le \max_{\tau < t} |d(\tau)|. \tag{7}$$

The next lemma is a consequence of definition (4).

Lemma 3.2 Let $\theta \in \Theta$ be constant and let $\sigma(t) \in \mathbb{N}$ denote the number of switches that occurred up to time $t$. Assume that system (1) is fed at time $t \in \mathbb{Z}_+$ by the control $u(t) = k_{i^*(t)}(x(t))$, where $i^*(t)$ is selected according to (6). Then $\sigma(t)$ can be upper-bounded as follows: [...]

[...] $a > 0$ for all $t$ and all $i \in \{1, \ldots, N\}$. Then, we may consider as performance signals the logarithms of the $\Delta_i$'s. In fact: [...]
$$\cdots = -\alpha_{i^*}(|x(t-1)|) + \gamma_{i^*}(\Delta_{i^*}(t)) \tag{10}$$
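The performance-signal recursion (4) and the scale-independent hysteresis rule (6) can be illustrated on a toy problem. The sketch below is a minimal example under stated assumptions, none of which come from the paper: a scalar plant $x(t+1) = \theta x(t) + d(t) + u(t)$ (which has the decoupled structure required by (1)), candidate gains $k_i(x) = -\theta_i x$ centred on intervals $\Theta_i$ of radius $r$, and the choices $V_i(x) = |x|$, $\alpha_i(s) = (1-r)s$, $\gamma_i(s) = s$. The function name `simulate` and all numerical values are hypothetical.

```python
import numpy as np

def simulate(theta=1.8, centers=(-1.5, -0.5, 0.5, 1.5), radius=0.5,
             c=0.5, steps=60, x0=1.0, seed=0):
    """Supervised loop for the toy plant x(t+1) = theta*x(t) + d(t) + u(t).

    Assumed design (illustrative only): k_i(x) = -centers[i]*x, V_i(x) = |x|,
    alpha_i(s) = (1 - radius)*s, gamma_i(s) = s, which satisfies the
    ISS-Lyapunov inequality whenever |theta - centers[i]| <= radius < 1.
    """
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)
    d = rng.uniform(-0.1, 0.1, size=steps)   # disturbance, bound unknown to the supervisor

    Delta = np.zeros(centers.size)           # performance signals, Delta_i(0) = 0
    i_star = 0                               # controller initially in the loop
    x = x0
    xs, idx = [x], [i_star]
    for t in range(steps):
        k_all = -centers * x                 # k_i(x(t)) for every candidate i
        u = k_all[i_star]
        x_next = theta * x + d[t] + u        # true plant step
        # Inferred state x(t+1|i): remove the applied input, add candidate i's input.
        x_hyp = x_next - u + k_all
        delta = np.abs(x_hyp) - abs(x) + (1.0 - radius) * abs(x)
        # gamma_i^{-1} is the identity here; negative delta is clipped at zero.
        Delta = np.maximum(Delta, np.maximum(delta, 0.0))
        # Scale-independent hysteresis: discount the signal of the current controller.
        score = Delta.copy()
        score[i_star] *= (1.0 - c)
        i_star = int(np.argmin(score))
        x = x_next
        xs.append(x)
        idx.append(i_star)
    return np.array(xs), idx, Delta, d
```

With these values the true parameter lies in the fourth cover element, so the corresponding performance signal should never exceed the largest disturbance sample seen so far, whereas the signals of the other candidates grow with the initial transient.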
Practical ISS in the sense of [13] for the overall scheme can now be proved.

Theorem 1 Let the parameter vector $\theta$ be constant. Then, system (1) controlled by the supervised state feedback $u(t) = k_{i^*(t)}(x(t))$, where $i^*(t)$ is selected according to the supervisory logic in (4) and (6), is practically input-to-state stable. □

4 Handling time-varying parameters
The supervisory logic (4) and (6) is capable of guaranteeing ISS under the condition that $\theta$ is constant. In the case of a time-varying parameter vector, very poor performance may result even if $\theta$ becomes constant after a finite time. This happens because performance signals computed as in (4) are monotonically non-decreasing; therefore, a controller which performed unsatisfactorily in the past can be switched on again in feedback to the system only after all the remaining controllers have actually been switched on and have performed at least as badly. Several alternatives for effectively dealing with the time-varying parameter case are discussed and justified on the basis of heuristic considerations.

4.1 Periodic reset
Slowly time-varying parameters can be easily managed by introducing, in the performance signal generation (4), a periodic reset as follows:
$$\text{if } t \bmod T = 0, \text{ then } \Delta_i(t) = 0, \quad \forall i \in \mathcal{N} \tag{13}$$
where $T$ is the reset period. Sometimes it might be convenient to inhibit switching for a certain number of samples right after a reset has occurred, so as to avoid spurious commutations (this interval is usually referred to in the literature as a dwell time). The period $T$ can be chosen by trading off readiness of the algorithm in detecting performance degradation against false-alarm rate.

4.2 Finite time window
A valid alternative, which avoids an explicit use of dwell times, is to adopt a finite time window in the generation of performance signals. Specifically, in such a case the updating law (4) is modified as follows:
$$\begin{aligned} \delta_i(t) &= V_i(x(t|i)) - V_i(x(t-1)) + \alpha_i(|x(t-1)|) \\ \Delta_i(t) &= \max_{j \in \{0, 1, \ldots, T-1\}} \gamma_i^{-1}(\delta_i(t-j)) \end{aligned} \tag{14}$$
where $T$ is the time-window length. Here too, $T$ should be selected by trading off readiness of the algorithm against rate of false alarms.

4.3 Exponential forgetting
If memory occupation is a concern, an exponential forgetting factor may help. For example, the following updating laws can be adopted in place of (4):
$$\begin{aligned} \Delta_i(0) &= 0 \\ \delta_i(t) &= V_i(x(t|i)) - V_i(x(t-1)) + \alpha_i(|x(t-1)|) \\ \Delta_i(t) &= \max\{\lambda\,\Delta_i(t-1),\ \gamma_i^{-1}(\delta_i(t))\} \end{aligned} \tag{15}$$
where $\lambda \in (0, 1)$ is the forgetting factor, which corresponds to a time-window length approximately equal to $1/(1 - \lambda)$.

Remark 4.1 It is worth pointing out that, when performance signals are guaranteed to be monotone non-decreasing functions of time (as in (4)), a controller is switched off at $t$ only if an increase in the corresponding noise estimate occurs. When measures are taken in order to deal with time-varying parameters, monotonicity of the performance signals is typically destroyed; therefore, in order to prevent high rates of false alarms, it is convenient to disable falsification whenever the performance signal happens to be nonincreasing. □

5 Handling non decoupled uncertainty
A central assumption in the development of the performance evaluation logic (4) is the decoupled form of (1). This allows computation of what the state would be following the activation of any controller in closed loop with the plant, without physically plugging in the controller. In this section we discuss an alternative approach which, without assuming any particular structure for the system, still guarantees satisfactory asymptotic properties of the overall control system. Throughout this section we assume for the plant the following uncertain nonlinear discrete-time model:
$$x(t+1) = f(x(t), u(t), d(t), \theta). \tag{16}$$
The idea is to modify (4) by updating, at each time iteration, only the performance signal relative to the controller currently operating in the loop:
$$\begin{aligned} \Delta_j(0) &= 0 \\ \delta(t) &:= V_{i^*(t-1)}(x(t)) - V_{i^*(t-1)}(x(t-1)) + \alpha_{i^*(t-1)}(|x(t-1)|) \\ \Delta_j(t) &= \begin{cases} \max\{\Delta_j(t-1),\ \gamma_j^{-1}(\delta(t))\} & \text{if } j = i^*(t-1) \\ \Delta_j(t-1) & \text{if } j \ne i^*(t-1). \end{cases} \end{aligned} \tag{17}$$
Notice that all of the properties which allowed us to derive Theorem 1 are preserved by the performance signal generation algorithm in (17). In particular:
• monotonicity of the signals $\Delta_i(t)$
• existence, for $\theta$ constant, of an index $i^*$ such that $\Delta_{i^*}(t) \le \max_{\tau < t} |d(\tau)|$
• hysteresis (additive or multiplicative) in the switching logic
Therefore, along the same lines as Theorem 1, the following result can be proved:

Theorem 2 Let the parameter vector $\theta$ be constant. Then, system (16) controlled by the supervised state feedback $u(t) = k_{i^*(t)}(x(t))$, where $i^*(t)$ is selected according to the supervisory logic (17) and (6), is practically input-to-state stable. □

If time-varying parameters are considered, it is convenient to introduce additional logics as in the previous section. It is worth mentioning that carrying out performance updates only when a controller is switched on in feedback to the plant inevitably yields longer and stronger transients.
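The update (17) differs from (4) only in that no inferred state is needed: the measured transition $x(t-1) \to x(t)$ is scored against the Lyapunov function of the controller that actually produced it. The following is a minimal sketch of one such update step. The function name `update_active_only`, the list-of-callables interface and the toy functions in the usage lines are assumptions made for illustration; clipping $\delta(t)$ at zero before applying $\gamma_j^{-1}$ is also an assumption (it leaves the maximum unchanged because the signals are nonnegative).

```python
import numpy as np

def update_active_only(Delta, i_prev, x_prev, x_now, V, alpha, gamma_inv):
    """One step of an update in the spirit of (17).

    Delta     : current performance signals, one per candidate controller
    i_prev    : index i*(t-1) of the controller that was in the loop
    x_prev    : measured state x(t-1);  x_now : measured state x(t)
    V, alpha, gamma_inv : lists of callables V_j, alpha_j, gamma_j^{-1}
                          (hypothetical interface, not from the paper)
    """
    Delta = np.array(Delta, dtype=float)            # work on a copy
    norm_prev = np.linalg.norm(np.atleast_1d(x_prev))
    delta = V[i_prev](x_now) - V[i_prev](x_prev) + alpha[i_prev](norm_prev)
    # Only the active controller's signal is refreshed; all others are frozen.
    Delta[i_prev] = max(Delta[i_prev], gamma_inv[i_prev](max(delta, 0.0)))
    return Delta

# Toy usage with two candidates and scalar state (illustrative functions only).
V = [abs, abs]
alpha = [lambda s: 0.5 * s] * 2
gamma_inv = [lambda s: s] * 2
Delta_grow = update_active_only([0.2, 0.7], 0, 1.0, 1.5, V, alpha, gamma_inv)
Delta_keep = update_active_only([0.2, 0.7], 0, 1.0, 0.1, V, alpha, gamma_inv)
```

In the first call the state grows, so the signal of controller 0 is raised; in the second the state contracts and both signals are left unchanged.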
6 An example: double cart with uncertain elastic coupling
[Figure 1: A double cart with elastic coupling]
[Figure 2: Disturbance force]
Consider the simple mechanical system in Fig. 1. We assume that cart masses are unitary, therefore a continuous-time linear model of the plant under investigation is as follows:
$$\ddot{x}_1 = -\theta(x_1 - x_2) - \beta\dot{x}_1 + u, \qquad \ddot{x}_2 = -\theta(x_2 - x_1) - \beta\dot{x}_2. \tag{18}$$
[Figure 3: Gain scheduling: (a) system output, (b) $\theta(t)$ and gain scheduler selection]
Assuming the state $x = [x_1, \dot{x}_1, x_2, \dot{x}_2]'$ is available for feedback, it is convenient to rewrite (18) in state-space form as:
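$$\dot{x}(t) = (\Phi_0 + \theta\,\Phi_1)\,x(t) + G_u u(t) + G_d d(t), \qquad y(t) = H x(t)$$
where
$$\Phi_0 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -\beta & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & -\beta \end{bmatrix}, \qquad \Phi_1 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix}, \qquad G_u = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$$
[...]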
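The affine-in-$\theta$ state matrix implied by (18) under the state ordering $x = [x_1, \dot{x}_1, x_2, \dot{x}_2]'$ can be checked numerically. The sketch below builds the matrices and compares them with a direct evaluation of (18). The function names and the numerical values of $\beta$, $\theta$, $x$ and $u$ are illustrative assumptions; the disturbance term is omitted because (18) does not show how $d$ enters.

```python
import numpy as np

def cart_matrices(beta):
    """Matrices Phi0, Phi1, Gu such that (18) reads xdot = (Phi0 + theta*Phi1) x + Gu u."""
    Phi0 = np.array([[0.0, 1.0, 0.0, 0.0],
                     [0.0, -beta, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0],
                     [0.0, 0.0, 0.0, -beta]])
    Phi1 = np.array([[0.0, 0.0, 0.0, 0.0],
                     [-1.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 0.0],
                     [1.0, 0.0, -1.0, 0.0]])
    Gu = np.array([0.0, 1.0, 0.0, 0.0])
    return Phi0, Phi1, Gu

def cart_rhs(x, u, theta, beta):
    """Right-hand side of (18) written component by component."""
    x1, v1, x2, v2 = x
    return np.array([v1,
                     -theta * (x1 - x2) - beta * v1 + u,
                     v2,
                     -theta * (x2 - x1) - beta * v2])

# Illustrative values (assumed, not taken from the paper).
beta, theta, u = 0.3, 1.3, 0.7
x = np.array([0.4, -1.0, 2.0, 0.5])
Phi0, Phi1, Gu = cart_matrices(beta)
xdot_matrix = (Phi0 + theta * Phi1) @ x + Gu * u
xdot_direct = cart_rhs(x, u, theta, beta)
```

Agreement of `xdot_matrix` and `xdot_direct` for arbitrary test values confirms that the uncertain stiffness $\theta$ enters the state matrix affinely, which is what makes a finite cover of $\Theta$ with one gain per element a natural design choice.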