51st IEEE Conference on Decision and Control December 10-13, 2012. Maui, Hawaii, USA
Non-local Stability of a Nash Equilibrium Seeking Scheme with Dither Re-use

Ronny J. Kutadinata, William H. Moase and Chris Manzie

R. J. Kutadinata, W. H. Moase and C. Manzie are with the Department of Mechanical Engineering, The University of Melbourne, 3010, Victoria, Australia. [email protected], [email protected], [email protected]

Abstract— This paper considers a simple Nash-equilibrium-seeking (NES) scheme acting on a family of plants that can be represented as non-cooperative games. Specifically, the family of plants under investigation has a number of inputs, each of which can be associated with a measured cost. The NES scheme consists of a number of decentralised extremum-seeking (ES) agents which, without requiring knowledge of the underlying plant dynamics, control each input in order to minimise its associated steady-state cost. A non-local stability result for the NES scheme is provided which allows two agents to use the same dither signal if the effect of each agent on the other's steady-state cost function is sufficiently weak. It is then demonstrated in simulation that, by reducing the number of distinct dither signals, the proposed scheme is able to more quickly seek the Nash equilibrium.
I. INTRODUCTION

Many contemporary engineering challenges, such as urban traffic networks, irrigation networks and groups of UAVs, involve the optimisation of large-scale systems. The difficulty of addressing these challenges is compounded when important system parameters, or even the system model itself, are unknown. Such scenarios would benefit from the use of some form of online optimisation, such as ES. ES is a non-model-based method of regulating the inputs of a plant to within a small region of the values that, in the steady state, minimise/maximise the system's output (which might be a measure of the system's performance). Typically, ES schemes involve the addition of a small dither to the plant's input. By observing the effect of the dither on the plant's output, it is possible to estimate the gradient of the steady-state relationship between the plant's inputs and output. This estimate can then be used in a gradient-based scheme to adapt the plant's input such that the steady-state output is minimised or maximised.

Local stability results for a family of ES schemes using sinusoidal dithers were presented in [1]. A semi-global practical asymptotic (SPA) stability result (with respect to the design parameters) for a simple sinusoidal-dither-based ES scheme was later presented in [2]. The stability result relies upon singular perturbation and periodic averaging techniques which decompose the closed-loop dynamics into three distinct time-scales, with: the plant dynamics acting in the fastest time-scale; the dither and gradient estimation acting in a slower time-scale; and the gradient-based adaptation of the plant input occurring in the slowest time-scale. The analysis and design of a variety of other ES schemes has been discussed in [3], [4], [5], [6]. However, the majority of the results in ES theory address only single-input single-output (SISO) systems.

Some ES results that consider multi-input single-output (MISO) systems include [7], [8], [9], [10]. A SPA stability result presented in [10] highlights two important requirements of a MISO ES scheme: the closed loop must have the same three-time-scale tuning as that discussed in [2]; and the number of distinct frequencies used in the dithers must increase in proportion to the number of plant inputs. As the number of inputs to a given system is increased, it is reasonable to expect that the latter of these requirements will result in the dither and gradient estimation task occupying an increasingly wide range of time-scales. When combined with the first of these requirements, one can expect that this contributes to the increasingly slow adaptation of the plant inputs toward their optimal values. Additionally, slow adaptation may also be caused by the increasing difficulty of the optimisation problem as its dimension grows. One potential solution to these problems involves the use of decentralised ES techniques.

Decentralised ES schemes such as those discussed in [11], [12], [13] feature many of the same characteristics as SISO and MISO ES schemes: dither signals; gradient estimation; and gradient-based adaptation of the input. However, rather than attempting to optimise a single output (or global cost function), groups of inputs are controlled by ES "agents", each of which has its own output/cost to optimise. This results in a non-cooperative game being played out between the ES agents, the solution of which is a Nash equilibrium. For this reason, decentralised ES schemes are also referred to as "NES schemes". A local stability result for a simple NES scheme that bears many similarities to the SISO ES scheme considered in [2] is provided in [14]. A particularly noteworthy result of [14] is that local stability of the Nash equilibrium requires, in some sense, each agent's cost function to be more strongly dependent upon its own input than upon the inputs of all the other agents.

To a large extent, NES schemes have been proposed as solutions to problems that are already described by non-cooperative games. As a result, there has been little focus on the use of decentralisation in order to improve the convergence speed of ES on large-scale systems. For example, as with many other NES schemes, the results of [14] require each input to the plant to be perturbed with a sinusoidal dither of a unique frequency. Thus, when applied to large-scale systems, the NES scheme can be expected to suffer similar limitations to traditional MISO ES approaches. One NES approach that addresses this issue is discussed
in [11], which allows dithers to be re-used between agents that do not affect each other's output/cost. However, such a result cannot be exploited in systems where each agent's output/cost is potentially dependent upon all of the inputs.

This paper considers a simple NES scheme acting on a family of systems where the effect of an input on any output/cost can be guaranteed to be sufficiently weak when some measure of the "distance" between the input and output is sufficiently large. In a sense, the family of systems considered lies between those considered in [14] (which requires a given agent's input to have a strong effect on its own cost function, but does not require a strict decay in the effect of an agent's input with distance) and [11] (which only allows direct "neighbours" to affect an agent's output/cost). It is shown that two agents may use the same dither without adversely impacting stability as long as they are sufficiently distant from each other. In addition, this is the first semi-global stability result for a NES scheme acting on a system with fairly general nonlinear dynamics. The convergence result is semi-global in the sense that the domain of attraction can be made arbitrarily large, assuming global asymptotic stability of the origin of the gradient system with robustness to small disturbances. Furthermore, it is demonstrated in simulation that, by re-using dither signals throughout the NES scheme, the proposed scheme is able to more quickly seek the Nash equilibrium. Although this work does not directly address the problem of global optimisation of a large-scale MISO system, through proper design of the cost functions, the Nash equilibrium of a decentralised problem can closely approximate the global optimum of the original problem (see [15] for example).

This paper is organised as follows. Section II describes the system and the NES scheme. Section III outlines the main stability results, and is followed by a simulation example in Section IV. Finally, Section V concludes the paper and proposes further work.

II. SYSTEM DESCRIPTION

A. Plant

The system to be optimised can be considered as a non-cooperative game with N_u players (control agents), each trying to minimise its own cost function y_i by controlling its input u_i (the case where each agent controls multiple inputs is left for future work). Specifically, the overall system is given by

\dot{x} = f(x, u) \tag{1a}

y = g(x, u) \tag{1b}
where x ∈ R^{N_x} and u, y ∈ R^{N_u}. The control objective is to drive the system to its optimum, which is characterised by a Nash equilibrium.

Assumption 1: There exists a function h : R^{N_u} → R^{N_x} such that x = h(u) is a globally asymptotically stable equilibrium of (1a), uniformly in u.

In other words, given any constant input, the states of the system will asymptotically converge to an equilibrium point which depends on the input. Let J(u) := g(h(u), u) be the input-to-output steady-state mapping of the system.

Assumption 2: There exists a unique Nash equilibrium point u* which satisfies

\frac{\partial J_i}{\partial u_i}(u^*) = 0, \qquad \frac{\partial^2 J_i}{\partial u_i^2}(u^*) > 0 \tag{2}

for all i ∈ {1, 2, . . . , N_u}.
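To make Assumption 2 concrete, consider the following two-player quadratic game (an illustrative example, not taken from the paper, with an arbitrary coupling strength ε):

J_1(u) = u_1^2 + \epsilon\, u_1 u_2, \qquad J_2(u) = u_2^2 + \epsilon\, u_1 u_2,

\frac{\partial J_1}{\partial u_1}(u) = 2u_1 + \epsilon u_2, \qquad \frac{\partial J_2}{\partial u_2}(u) = 2u_2 + \epsilon u_1,

so the stationarity conditions in (2) give u* = (0, 0) as the unique Nash equilibrium whenever |ε| ≠ 2, with ∂²J_i/∂u_i² = 2 > 0 for i = 1, 2. The coupling ε plays the same role here that the neighbourhood decay of Assumption 3 controls below.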
Each agent has a group of neighbour agents which are, in some sense, closest or adjacent to the agent. The set of the neighbours of an agent is denoted by N_i, which contains the indices of the neighbour agents. Furthermore, a larger set of neighbours, which contains the neighbours of the neighbours (and so on), is represented by the set N_i^R, where the radius of the neighbourhood is denoted by R. Specifically,

N_i^R = N_i \cup \underbrace{\bigcup_{j \in N_i} N_j \cup \bigcup_{k \in N_j} (\dots)}_{\text{cascade } R \text{ times}} = N_i \cup \bigcup_{j \in N_i} N_j^{R-1} \tag{3}

where N_i^1 = N_i.

Assumption 3: For any (∆, γ) ∈ R^2_{>0}, there exists R* ∈ Z_{>0} such that

\sup_{\|u - u^*\| \le \Delta} \sum_{j \notin N_i^R} \left| \frac{\partial J_i}{\partial u_j} \right| \le \gamma, \qquad \forall R \ge R^* \tag{4}

for all i ∈ {1, 2, . . . , N_u}.
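As an illustration of the recursion in (3), the radius-R neighbourhood can be computed by repeatedly expanding the one-step neighbour sets. The following is a minimal sketch (not from the paper), assuming a symmetric neighbour relation and that the one-step neighbour sets N_i are available as a Python dictionary; the function name and the example chain are hypothetical:

```python
# Minimal sketch of the cascade in (3): expand the one-step neighbour sets
# R times to obtain N_i^R. `neighbours` maps each agent index to the set N_i.
def neighbourhood(neighbours, i, R):
    """Return N_i^R by cascading the one-step neighbour sets R times, as in (3)."""
    reach = set(neighbours[i])                 # N_i^1 = N_i
    for _ in range(R - 1):                     # cascade R times in total
        reach = reach | {k for j in reach for k in neighbours[j]}
    return reach                               # note: may include i itself for R >= 2

# Example: the 11-agent chain used in Section IV, where N_i = {i-1, i+1}.
chain = {i: {j for j in (i - 1, i + 1) if 1 <= j <= 11} for i in range(1, 12)}
print(sorted(neighbourhood(chain, 6, 3)))      # -> [3, 4, 5, 6, 7, 8, 9]
```

Whether i appears in its own neighbourhood is immaterial for Assumption 3, since the sum in (4) only involves agents outside N_i^R.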
Thus, it is assumed that "distant" agents have only a small effect on the cost function of an agent. In other words, the effect of an agent is felt strongly by direct neighbours and becomes increasingly weak as the agents considered are more distant.

Assumption 4: There exist α_1, α_2 ∈ K_∞, α_3, α_4 ∈ K and a radially unbounded V(·) : R^{N_u} → R such that, for all ũ ∈ R^{N_u}, the following hold:

\alpha_1(\|\tilde{u}\|) \le V(\tilde{u}) \le \alpha_2(\|\tilde{u}\|) \tag{5a}

\sum_{i=1}^{N_u} \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \, \frac{\partial J_i}{\partial u_i}(\tilde{u} + u^*) \ge \alpha_3(\|\tilde{u}\|) \tag{5b}

\|\nabla V(\tilde{u})\| \le \alpha_4(\|\tilde{u}\|) \tag{5c}

This assumption provides global asymptotic stability of the origin of a system of the form

\frac{d\tilde{u}_i}{d\tau} = -\frac{\partial J_i}{\partial u_i}(\tilde{u} + u^*), \qquad i = 1, 2, \dots, N_u \tag{6}

where ũ and τ are the auxiliary system's states and time-scale respectively. Assumption 4 also ensures that (6) has an asymptotically stable solution close to the origin when subjected to sufficiently small perturbations. This underpins the main convergence result for the proposed NES scheme, which is proven by showing that the NES scheme can be designed such that the behaviour of the full closed-loop system approximates (6) with arbitrary accuracy.
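For intuition, the gradient system (6) can be simulated directly. The sketch below (illustrative only) integrates (6) for the toy two-player game introduced after Assumption 2; the coupling ε, step size and initial condition are arbitrary choices, not from the paper:

```python
# Forward-Euler integration of the gradient system (6) for the toy game
# J_1 = u_1^2 + eps*u_1*u_2, J_2 = u_2^2 + eps*u_1*u_2, whose Nash equilibrium is 0.
import numpy as np

eps, dtau = 0.5, 0.01
u_star = np.zeros(2)                       # Nash equilibrium of the toy game
u_tilde = np.array([2.0, -1.5])            # arbitrary initial offset from u*

def grad_own(u):                           # [dJ_1/du_1, dJ_2/du_2] evaluated at u
    return np.array([2.0 * u[0] + eps * u[1], 2.0 * u[1] + eps * u[0]])

for _ in range(2000):
    u_tilde += dtau * (-grad_own(u_tilde + u_star))   # eq. (6)
print(np.linalg.norm(u_tilde))             # decays towards 0, the Nash equilibrium
```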
[Fig. 1: Schematic of Nash equilibrium seeking for control agent i, where u_{-i} is the action of the other agents. The measured cost y_i of the plant ẋ = f(x, u), y = g(x, u) is demodulated by −kω sin(ω[ω_i t + φ_i]), integrated to give ū_i, and perturbed by the dither a sin(ω[ω_i t + φ_i]) to form the input u_i.]

B. Nash Equilibrium Seeking Scheme

As depicted in Figure 1, the Nash equilibrium seeking scheme is given by

\dot{x} = f(x, \bar{u} + a s(\omega t)) \tag{7a}

\dot{\bar{u}} = -k\omega \, \mathrm{diag}(s(\omega t)) \, g(x, \bar{u} + a s(\omega t)) \tag{7b}

where s(ωt) is both the dither and the demodulation signal,

s(\omega t) = \begin{bmatrix} \sin(\omega[\omega_1 t + \phi_1]) \\ \sin(\omega[\omega_2 t + \phi_2]) \\ \vdots \\ \sin(\omega[\omega_{N_u} t + \phi_{N_u}]) \end{bmatrix} \tag{8}

and diag(·) is a diagonal matrix whose diagonal elements are given by the input vector.

Assumption 5: There exist ω_min and ω(·) : Z_{>0} → ([ω_min, ∞) ∩ Q)^{N_u}, where ω = [ω_1 ω_2 . . . ω_{N_u}]^T. Furthermore, the dither frequencies inside the neighbourhood of radius R are distinct; that is, for all i ∈ {1, 2, . . . , N_u}, the dither frequency ω_j ≠ ω_i for all j ∈ N_i^R.

III. ANALYSIS

A. Reduced System

First, a new time-scale is defined, τ = ωt, and the closed-loop system (7) can be expressed in this new time-scale as

\omega \frac{dx}{d\tau} = f(x, \bar{u} + a s(\tau)) \tag{9a}

\frac{d\bar{u}}{d\tau} = -k \, \mathrm{diag}(s(\tau)) \, g(x, \bar{u} + a s(\tau)) \tag{9b}

For small ω, the system states x will quickly settle to their equilibrium x = h(ū + a s(τ)) and the reduced system is

\frac{d\bar{u}_r}{d\tau} = -k \, \mathrm{diag}(s(\tau)) \, g(h(\bar{u}_r + a s(\tau)), \bar{u}_r + a s(\tau)) = -k \, \mathrm{diag}(s(\tau)) \, J(\bar{u}_r + a s(\tau)) \tag{10}

B. Periodic Average of Reduced System

For small k, a conclusion regarding the stability of the reduced system can be drawn by investigating the periodic average of the reduced system. Let T be the least common multiple of 2π/ω_1, 2π/ω_2, . . . , 2π/ω_{N_u}, which always exists by Assumption 5. The average system is

\frac{d\bar{u}_{av}}{d\tau} = -\frac{k}{T} \int_0^T \mathrm{diag}(s(\tau)) \, J(\bar{u}_{av} + a s(\tau)) \, d\tau \tag{11}

Consider the Taylor series expansion of J(ū_av + a s(τ)):

J(\bar{u}_{av} + a s(\tau)) = J(\bar{u}_{av}) + a \nabla J(\bar{u}_{av}) s(\tau) + O(a^2) \tag{12}

where

\nabla J = \begin{bmatrix} \nabla J_1 & \nabla J_2 & \dots & \nabla J_{N_u} \end{bmatrix}^T \tag{13}

Substituting (12) and (13) into (11) yields

\frac{d\bar{u}_{av,i}}{d\tau} = -\frac{k}{T} \int_0^T \Big( \sin(\omega_i \tau + \omega\phi_i) J_i(\bar{u}_{av}) + a \sin(\omega_i \tau + \omega\phi_i) \nabla J_i(\bar{u}_{av})^T s(\tau) + O(a^2) \Big) \, d\tau \tag{14}

The average of the first term is 0, and the average of the second term is also 0 when ω_i ≠ ω_j, due to orthogonality. Therefore, the average system can be simplified as

\frac{d\bar{u}_{av,i}}{d\tau} = -k \frac{a}{2} \sum_{j : \omega_j = \omega_i} \frac{\partial J_i}{\partial u_j}(\bar{u}_{av}) + O(a^2) \tag{15}

Under Assumption 3, it is clear that if R is sufficiently large (i.e. ω_i is different from ω_j for enough agents), then the sum of ∂J_i/∂u_j over all j such that ω_j = ω_i and j ≠ i is less than γ, so that

\frac{d\bar{u}_{av,i}}{d\tau} = -k \frac{a}{2} \frac{\partial J_i}{\partial u_i}(\bar{u}_{av}) + O(a\gamma) + O(a^2) \tag{16}

Thus, for sufficiently small a and γ, the behaviour of the averaged system is expected to approximate a gradient-based optimiser.
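As a numerical illustration of the averaging step from (11) to (15), the following sketch (illustrative only, using a toy quadratic cost and coupling matrix that are not from the paper) demodulates a dithered cost over one common period and compares the result with (a/2) times the sum of ∂J_i/∂u_j over the agents j that share agent i's dither frequency:

```python
# Numerical check of (15): averaging s_i(tau) * J_i(u_bar + a*s(tau)) over one
# common period recovers (a/2) * sum_{j: omega_j = omega_i} dJ_i/du_j(u_bar).
# Toy quadratic costs; agents 1 and 4 deliberately share a dither frequency.
import numpy as np

a = 0.05
omega = np.array([1.0, 2.0, 3.0, 1.0])          # rationally related, so T = 2*pi
C = 0.2 * np.ones((4, 4)) + 0.8 * np.eye(4)     # hypothetical coupling matrix

def J(u):                                        # J_i(u) = 0.5 * sum_j C_ij * u_j^2
    return 0.5 * (C @ u**2)

def gradJ(u):                                    # dJ_i/du_j = C_ij * u_j
    return C * u[None, :]

u_bar = np.array([0.7, -0.4, 0.3, 0.5])
tau = np.linspace(0.0, 2.0 * np.pi, 20000, endpoint=False)
s = np.sin(np.outer(omega, tau))                 # dither/demodulation signals

demod = (s * J(u_bar[:, None] + a * s)).mean(axis=1)   # approximates (1/T) * integral
pred = np.array([(a / 2) * sum(gradJ(u_bar)[i, j] for j in range(4)
                               if omega[j] == omega[i]) for i in range(4)])
print(np.round(demod, 5), np.round(pred, 5))     # agree up to O(a^2) terms
```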
C. Stability Analysis of Averaged and Reduced Systems

Before introducing the main result, it is useful to present the stability analysis of the averaged and the reduced systems. Since the stability analysis of the reduced system requires the stability of the averaged system, the result for the reduced system is presented after the result for the averaged system. The stability of the averaged system (15) is stated in the following lemma.

Lemma 1: Under Assumptions 2–5, there exists β_av ∈ KL such that for any (∆, ν) ∈ R^2_{>0} there exist (a_0, γ_0) ∈ R^2_{>0} such that, for all (a, k, γ) ∈ (0, a_0] × R_{>0} × (0, γ_0], the solutions of the average system (15) with initial condition ‖ū_av(0) − u*‖ ≤ ∆ satisfy

\|\bar{u}_{av}(\tau) - u^*\| \le \beta_{av}(\|\bar{u}_{av}(0) - u^*\|, ka\tau) + \nu, \qquad \forall \tau \ge 0

Proof: The Lyapunov function from Assumption 4 can be used to prove stability of the average system. First define
F_{av}(\bar{u}_{av}) = \frac{1}{T} \int_0^T \mathrm{diag}(s(\tau)) \, J(\bar{u}_{av} + a s(\tau)) \, d\tau \tag{17}

Now let ũ = ū_av − u* and consider the Lyapunov function V(ũ), where V(·) is given in Assumption 4. Then

\frac{1}{k}\frac{dV}{d\tau} = -\nabla V(\tilde{u})^T F_{av}(\tilde{u} + u^*) = -\frac{a}{2} \sum_{i=1}^{N_u} \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \frac{\partial J_i}{\partial u_i}(\tilde{u} + u^*) - \frac{a}{2} \sum_{i=1}^{N_u} \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \sum_{\substack{j : \omega_j = \omega_i \\ j \ne i}} \frac{\partial J_i}{\partial u_j}(\tilde{u} + u^*) - \nabla V(\tilde{u})^T \big( F_{av}(\tilde{u} + u^*) - G(\tilde{u} + u^*) \big) \tag{18}

where

G_i(\tilde{u} + u^*) = \frac{a}{2} \sum_{j : \omega_j = \omega_i} \frac{\partial J_i}{\partial u_j}(\tilde{u} + u^*) \tag{19}

for all i = 1, 2, . . . , N_u. Consider the first term in (18). Using (5b) from Assumption 4,

-\sum_{i=1}^{N_u} \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \frac{\partial J_i}{\partial u_i}(\tilde{u} + u^*) \le -\alpha_3(\|\tilde{u}\|) \tag{20}

Now consider the second term. Using Assumption 3 and (5c) from Assumption 4,

-\sum_{i=1}^{N_u} \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \sum_{\substack{j : \omega_j = \omega_i \\ j \ne i}} \frac{\partial J_i}{\partial u_j}(\tilde{u} + u^*) \le \gamma \sum_{i=1}^{N_u} \left| \frac{\partial V}{\partial \tilde{u}_i}(\tilde{u}) \right| \le \gamma \|\mathbf{1}\| \|\nabla V(\tilde{u})\| \le \gamma \|\mathbf{1}\| \alpha_4(\|\tilde{u}\|) \tag{21}

Similarly, after using (5c) from Assumption 4 for the third term and substituting (20) and (21) into (18),

\frac{1}{k}\frac{dV}{d\tau} \le -\frac{a}{2}\alpha_3(\|\tilde{u}\|) + \frac{a\gamma}{2}\|\mathbf{1}\|\alpha_4(\|\tilde{u}\|) + \alpha_4(\|\tilde{u}\|) \, \|F_{av}(\tilde{u} + u^*) - G(\tilde{u} + u^*)\| \tag{22}

The first term on the right-hand side of (22) is negative definite of O(a); the second and third terms are positive definite of O(aγ) and O(a²) respectively. Now let B(c) := {ũ ∈ R^{N_u} : ‖ũ‖ ≤ c} denote a ball of radius c, and let D := {ũ ∈ R^{N_u} : V(ũ) ≤ α_2(∆)}. Using (5a) and realising that B(∆) ⊆ D, V(ũ) must increase if ũ is to leave D after being initialised in B(∆). It follows that there exist (a_0, γ_0) ∈ R^2_{>0} and α_av ∈ K such that, for all (a, γ) ∈ (0, a_0] × (0, γ_0] and ε = α_av(a + γ), dV/dτ ≤ 0 for all ũ ∈ D − B(ε). Thus, for sufficiently small a and γ, V is guaranteed to decrease with time until ‖ũ‖ ≤ ε, in which case

\alpha_2(\varepsilon) \ge \alpha_2(\|\tilde{u}\|) \ge V(\tilde{u}) \ge \alpha_1(\|\tilde{u}\|) \tag{23}

and therefore

\|\tilde{u}\| \le \alpha_1^{-1} \circ \alpha_2(\varepsilon) = \alpha_1^{-1} \circ \alpha_2 \circ \alpha_{av}(a + \gamma) \tag{24}

Lemma 1 follows after noting that the region to which ũ converges can be made arbitrarily small by decreasing a and γ.

The stability of the reduced system (10) is stated in the following lemma.

Lemma 2: Under Assumptions 2–5, there exists β_r ∈ KL such that for any (∆, ν) ∈ R^2_{>0} there exist (a_0, k_0, γ_0) ∈ R^3_{>0} such that, for all (a, k, γ) ∈ (0, a_0] × (0, k_0] × (0, γ_0], the solutions of the reduced system (10) with initial condition ‖ū_r(0) − u*‖ ≤ ∆ satisfy

\|\bar{u}_r(\tau) - u^*\| \le \beta_r(\|\bar{u}_r(0) - u^*\|, ka\tau) + \nu, \qquad \forall \tau \ge 0

Proof: Let p(q, τ, a, k) represent the T-periodic (in τ) solution of

\frac{\partial p}{\partial \tau} = k F_{av}(q, a) - k \, \mathrm{diag}(s(\tau)) \, J(q + p + a s(\tau)) \tag{25}

For any bounded domain D_0 ⊂ R^{N_u} containing u*, the solution of (25) will be p(q, τ, a, k) = O(k), uniformly in (q, τ) ∈ D_0 × R and small (a, k). Furthermore, there is a w such that p(w, τ, a, k) exists and satisfies

w + p(w, \tau, a, k) = \bar{u}_r \tag{26}

for sufficiently small (a, k) and any ū_r satisfying ū_r − u* ∈ D_0. Differentiating (26) with respect to τ gives

\frac{dw}{d\tau} = -k \left( I + \left. \frac{\partial p}{\partial q} \right|_{q=w} \right)^{-1} F_{av}(w, a) \tag{27}

Note that ∂p/∂q can be made arbitrarily small by decreasing k. Therefore, the solution of (27) approaches the behaviour of the averaged system. Hence, using arguments similar to those of Lemma 1, the Lyapunov function V(w − u*) can be used to prove convergence of ‖w − u*‖ to a solution that can be made arbitrarily small by decreasing (a, k, γ). Lemma 2 follows after noting that the same conclusion can be drawn for ū_r from (26).

D. Main Result

Before introducing the main stability result, define

z = \begin{bmatrix} x - h(u) \\ \bar{u} - u^* \end{bmatrix} \tag{28}

Theorem 1: Under Assumptions 1–5, there exists β(·, ·) ∈ KL such that for any (∆, ν) ∈ R^2_{>0} there exist (a*, k*, R*) ∈ R^2_{>0} × Z_{>0} such that for all (a, k, R) ∈ (0, a*] × (0, k*] × Z_{≥R*} there is an ω* ∈ R_{>0} such that, for all ω ∈ (0, ω*], the solutions of (7) with any ‖z(t_0)‖ ≤ ∆ satisfy

\|z(t)\| \le \beta(\|z(t_0)\|, ka\omega(t - t_0)) + \nu, \qquad \forall t \ge t_0

Proof: The proof follows three steps. First, the average system (15) is shown to be SPA stable uniformly in small (a, γ) (see Lemma 1). Secondly, the reduced system (10) is proven to be SPA stable uniformly in (k, a, γ) (see Lemma 2). In addition, the stability of the boundary layer system is provided by Assumption 1. Finally, using a singular perturbation argument similar to [2, Lemma 1], the full
closed-loop system can be proven to be SPA stable with respect to (k, a, γ, ω) uniformly in (k, a, γ). Theorem 1 follows after noting that the conclusion can be restated to incorporate R instead of γ (because R is the actual design parameter) by Assumption 3.

Remark 1: Theorem 1 is technically not an SPA stability result, since R is not continuous. However, it says that the full closed-loop system converges to a small neighbourhood of the Nash equilibrium for sufficiently small (k, a, ω) and sufficiently large R, where only (k, a, R) can be chosen independently.

IV. SIMULATION EXAMPLE

The aim of the simulation is to investigate the effect of the design parameter R on the convergence speed of the closed-loop system. It is expected that by reducing R up to a certain point, the convergence speed can be increased due to the smaller number of dither frequencies used. Beyond that point, it is expected that any given agent's gradient estimate will become increasingly influenced by nearby agents with the same dither frequency. This could adversely affect the convergence speed, or even the stability, of the NES scheme. The advantage of dither re-use is the reduction of the range of time-scales occupied by the dither and gradient estimation task, thereby allowing the use of larger k. It is expected that the range of time-scales is not only affected by the individual dither frequencies chosen, but also by the differences between any two dither frequencies. Since it is difficult to quantify these effects a priori, the influence of dither re-use is investigated through a set of numerical experiments.

Consider a nonlinear map with first-order lag output dynamics

\dot{y} = \frac{J(u) - y}{0.5} \tag{29}

where u, y ∈ R^{11} are the input and output respectively. The neighbours of agent i are agents i−1 and i+1. An exception applies to the agents on the edges, u_1 and u_11, which only have one neighbour each, u_2 and u_10 respectively. The cost function to be minimised, J(u), is of the form

J(u) = C \begin{bmatrix} u_1^2 & u_2^2 & \dots & u_{11}^2 \end{bmatrix}^T \tag{30}

where C ∈ R^{11×11} with c_{i,i} = 1 and

c_{i,j} = -\frac{1.6}{|i - j|} \, e^{-|i - j| / 1.2}, \qquad \forall i \ne j \tag{31}

where |i − j| represents the distance between agents i and j. Therefore, the Nash equilibrium of the system is located at the origin. The dither amplitude a and the time-scale constant ω are set to 1 for all simulations. Note that although ω is kept constant, the time-scale separation can be achieved by choosing sufficiently small ω_i and k. The other design parameters, which are the main concerns in this simulation, are the adaptation gain k, the neighbourhood radius R and the agents' dither frequencies ω_i. The maximum value of R in this problem is 5, in which case each agent has a unique dither frequency. In order to focus
on the effect of R, it is sensible to fix the maximum dither frequency ω_max = 1.2 for all simulations. Moreover, an equal dither frequency spacing ∆ω is set such that, for all i ∈ {1, 2, . . . , N_u},

\omega_i = \begin{cases} \omega_{\max} & \text{if } i \equiv 1 \pmod{2R + 1} \\ \omega_{i-1} - \Delta\omega & \text{otherwise} \end{cases} \tag{32}
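For reference, a minimal sketch of the assignment rule (32) is given below (not the authors' code; the helper name is hypothetical). With the values from Table I it reproduces the listed ω_min for each R, and it places agents that share a frequency 2R + 1 positions apart in the chain, hence outside each other's radius-R neighbourhood:

```python
# Minimal sketch of the dither frequency assignment (32) for a chain of agents.
def dither_frequencies(Nu, R, omega_max, d_omega):
    """Assign dither frequencies so that agents within radius R never share one."""
    omega = []
    for i in range(1, Nu + 1):                  # agents indexed 1, ..., Nu
        if i % (2 * R + 1) == 1:                # i = 1 (mod 2R+1): restart the ramp
            omega.append(omega_max)
        else:
            omega.append(omega[-1] - d_omega)   # otherwise step down by d_omega
    return omega

freqs = dither_frequencies(11, 3, 1.2, 0.13)    # R = 3 row of Table I
print(round(min(freqs), 3))                     # 0.42, the omega_min listed for R = 3
```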
Furthermore, to eliminate the dependency of the result on the choice of initial condition, results are obtained from twenty-six different initial conditions. The settling time is defined as t_s = max_m(t_s^m), where ‖ū‖_2 ≤ 5 for all t ≥ t_s^m with the m-th initial condition. At each R, an exhaustive optimisation is performed to find the values of ∆ω and k such that t_s is minimised. The result is summarised in Table I. The globally optimal parameters for the R = 5 case are not robust against small changes in either parameter: a small change in ∆ω or k may lead to significantly worse performance or even instability. Thus, the details of a local minimum for the R = 5 case are also provided in Table I. Performance of the NES scheme near this local minimum is less severely affected by changes in ∆ω and k.

TABLE I: The optimal t_s found using different values of R

R          | ∆ω    | ω_min | k      | t_s
-----------|-------|-------|--------|-------
5 (global) | 0.083 | 0.370 | 0.0075 | 407.24
5 (local)  | 0.070 | 0.500 | 0.0082 | 408.84
4          | 0.079 | 0.568 | 0.0086 | 380.07
3          | 0.130 | 0.420 | 0.0093 | 349.68
2          | 0.153 | 0.588 | 0.0096 | 348.86
From Table I, it is evident that the convergence speed can be improved by reducing R (given that the other parameters have been optimised) up to a certain point where the interaction between agents is still small. By reducing R to R = 3, the optimal ∆ω and k are increased and, ultimately, the convergence speeds up by 14% relative to the R = 5 case. For the R = 2 case, the aforementioned trend in the optimal parameters is still observed; however, the further reduction in t_s is not significant. It is believed that the strong coupling between agents that share the same dither frequency has adversely affected the convergence speed. The following conclusions can be made from the simulation results. Firstly, by re-using dither frequencies, it is possible to increase one of ω_min and ∆ω, or even both, which allows the use of larger k and, therefore, faster Nash equilibrium seeking. Secondly, too small an R adversely affects the accuracy of the gradient estimation, which could prevent any significant increase in convergence speed. Thus, there is an optimal R, smaller than the maximum possible R, which gives the fastest convergence. Figure 2 shows an example of the system's convergence to its Nash equilibrium. Therefore, it has been demonstrated that NES with dither re-use has the potential to achieve convergence faster than NES without dither re-use.
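As a rough guide to how such an experiment could be reproduced, the following sketch (not the authors' code; zero dither phases, forward-Euler integration and a single arbitrary initial condition are assumed) integrates the plant (29) under the NES law (7), with the cost (30)–(31), frequencies assigned by (32), and the R = 3 parameters of Table I:

```python
# Minimal closed-loop sketch: plant (29) with cost (30)-(31), NES law (7),
# frequencies assigned by (32) using the R = 3 parameters from Table I.
import numpy as np

Nu, R, a, w = 11, 3, 1.0, 1.0                   # w is the time-scale constant omega
omega_max, d_omega, k = 1.2, 0.13, 0.0093
omega = np.empty(Nu)
for i in range(Nu):                             # frequency assignment, eq. (32)
    omega[i] = omega_max if i % (2 * R + 1) == 0 else omega[i - 1] - d_omega

idx = np.arange(1, Nu + 1)
dist = np.abs(idx[:, None] - idx[None, :])
C = np.where(dist == 0, 1.0,                    # cost matrix, eq. (31)
             -1.6 / np.maximum(dist, 1) * np.exp(-dist / 1.2))

def J(u):
    return C @ u**2                             # steady-state costs, eq. (30)

dt = 0.01
u_bar = 5.0 * np.ones(Nu)                       # hypothetical initial condition
y = J(u_bar)                                    # plant output state, eq. (29)
for n in range(int(500.0 / dt)):
    t = n * dt
    s = np.sin(w * omega * t)                   # dither/demodulation (zero phases)
    y += dt * (J(u_bar + a * s) - y) / 0.5      # first-order lag, eq. (29)
    u_bar += dt * (-k * w * s * y)              # decentralised update, eq. (7b)
print(np.linalg.norm(u_bar))                    # drifts toward the Nash equilibrium at 0
```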
[Fig. 2: Convergence of ‖ū‖_2 (axis 0–50) and ‖y‖_2 (axis 0–1200) against t (s), 0–500 s, with R = 3 and optimised ∆ω and k.]

V. CONCLUSIONS AND FUTURE WORK

Semi-global stability of the Nash equilibrium with respect to the design parameters was demonstrated for the proposed control scheme applied to a nonlinear MIMO system satisfying certain conditions. These conditions are consistent with existing ES results, but one extra condition is required which guarantees that the effect of an agent's input on another agent's cost function decays to a sufficiently small value as the distance between them is increased. This allows any given dither frequency to be used by multiple agents if those agents have a sufficiently weak effect on each other's steady-state cost function. Simulations were presented that demonstrate the potential improvement in convergence speed obtained by re-using dither frequencies. Some extensions for future work include considering agents with multiple inputs; cost function design to make the Nash equilibrium approximate the global optimum; combining this work with previous results to improve convergence speed (see [6] for example); and deployment of the controller on practical/real-world problems.

VI. ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of the reviewers.

REFERENCES

[1] K. B. Ariyur and M. Krstić, Real-time Optimization by Extremum Seeking Control. Hoboken, NJ: Wiley Interscience, 2003.
[2] Y. Tan, D. Nešić, and I. Mareels, "On non-local stability properties of extremum seeking control," Automatica, vol. 42, no. 6, pp. 889–903, 2006.
[3] D. Nešić, Y. Tan, and I. Mareels, "On the choice of dither in extremum seeking systems: A case study," in Proceedings of the 45th IEEE Conference on Decision and Control, Dec. 2006, pp. 2789–2794.
[4] Y. Tan, D. Nešić, I. Mareels, and A. Astolfi, "On global extremum seeking in the presence of local extrema," Automatica, vol. 45, no. 1, pp. 245–251, 2009.
[5] W. Moase, C. Manzie, and M. Brear, "Newton-like extremum-seeking for the control of thermoacoustic instability," IEEE Transactions on Automatic Control, vol. 55, no. 9, pp. 2094–2105, Sept. 2010.
[6] W. H. Moase and C. Manzie, "Semi-global stability analysis of observer-based extremum-seeking for Hammerstein plants," IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1685–1695, 2011.
[7] J. C. Spall, "An overview of the simultaneous perturbation method for efficient optimization," Johns Hopkins APL Technical Digest, vol. 19, no. 4, pp. 482–492, 1998.
[8] K. Ariyur and M. Krstić, "Analysis and design of multivariable extremum seeking," in Proceedings of the 2002 American Control Conference, vol. 4, 2002, pp. 2903–2908.
[9] B. Srinivasan, "Real-time optimization of dynamic systems using multiple units," International Journal of Robust and Nonlinear Control, vol. 17, no. 13, pp. 1183–1193, 2007.
[10] W. Moase, Y. Tan, D. Nešić, and C. Manzie, "Non-local stability of a multi-variable extremum-seeking scheme," in Proceedings of the 2011 Australian Control Conference (AuCC), Nov. 2011, pp. 38–43.
[11] M. Stanković, K. Johansson, and D. Stipanović, "Distributed seeking of Nash equilibria in mobile sensor networks," in Proceedings of the 49th IEEE Conference on Decision and Control, Dec. 2010, pp. 5598–5603.
[12] P. Frihauf, M. Krstić, and T. Başar, "Nash equilibrium seeking in noncooperative games," IEEE Transactions on Automatic Control, vol. 57, no. 5, pp. 1192–1207, 2011.
[13] S.-J. Liu and M. Krstić, "Stochastic Nash equilibrium seeking for games with general nonlinear payoffs," SIAM Journal on Control and Optimization, vol. 49, no. 4, pp. 1659–1679, 2011.
[14] P. Frihauf, M. Krstić, and T. Başar, "Nash equilibrium seeking for dynamic systems with non-quadratic payoffs," in Proceedings of the 18th IFAC World Congress, Aug. 2011, pp. 3605–3610.
[15] B. Avi-Itzhak, B. Golany, and U. G. Rothblum, "Strategic equilibrium versus global optimum for a pair of competing servers," Journal of Applied Probability, vol. 43, no. 4, pp. 1165–1172, 2006.