
CONVERGENCE OF DIFFERENTIAL VARIATIONAL INEQUALITIES


Convergent Dynamics of Nonreciprocal Differential Variational Inequalities Modeling Neural Networks

Mauro Di Marco, Mauro Forti, Massimo Grazzini, Luca Pancioni

Abstract—The paper addresses convergence of solutions for a class of differential inclusions termed differential variational inequalities (DVIs). Each DVI describes the dynamics of a neural network (NN) evolving in a closed hypercube of $\mathbb{R}^n$ and defined by a continuously differentiable, cooperative and (possibly) nonreciprocal vector field $f$. The main result in the paper is that under a new condition on $f$, which is called the strong Kamke-Müller (SKM) condition, the solution semiflow generated by the DVI is strongly order preserving (SOP) and hence it satisfies a LIMIT SET DICHOTOMY and enjoys generic convergence properties. A characterization of the SKM condition is given in terms of the interconnection properties of the Jacobian matrix of $f$. In the case where $f$ is an affine, or a linear, vector field the considered DVIs include two relevant classes of NNs, namely, the linear systems operating on a closed hypercube, also known as linear systems in saturated mode (LSSMs), and the full-range (FR) model of cellular neural networks (CNNs). By applying the results to LSSMs it is obtained that any cooperative LSSM with a (possibly) nonsymmetric and fully interconnected matrix is generically convergent. Analogous results hold for FRCNNs. All the obtained convergence results hold in the general case where the DVIs, and the LSSMs and FRCNNs, possess multiple equilibrium points.

Index Terms—Neural networks; convergence; cooperative dynamical systems; Kamke-Müller condition; limit set dichotomy; differential variational inequalities.

Manuscript received —; revised —. The authors are with the Dipartimento di Ingegneria dell'Informazione, Università di Siena, 53100 Siena, Italy (e-mail: {dimarco, forti, grazzini, pancioni}@dii.unisi.it). Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to [email protected].

I. INTRODUCTION

Relevant applications to signal processing tasks in the fields of pattern recognition, decision making, associative memories, image processing and combinatorial optimization require that a neural network (NN) possesses multiple stable equilibrium points (EPs) and that each solution converges toward an EP depending on the initial conditions. This property is referred to in the literature as convergence (of NNs with multiple EPs) or multistability of NNs [1]–[15].

In the first studies convergence of NNs with multiple EPs was guaranteed by imposing that the neuron interconnection matrix satisfies some symmetry condition [2], [3], [16]. One main problem with this approach is that, due to tolerances in the electronic implementation and element variations, the symmetry condition can never be satisfied exactly in practice [17], [18]. It has also been shown that there are classes of NNs exhibiting nonconvergent and complex dynamics even arbitrarily close to symmetry, as in the NN with three competing neurons modeling the Voting Paradox [19]. Moreover, there are applications where drastically nonsymmetric interconnection matrices are deliberately used [17], [20]. Therefore, an important problem is to discover conditions for convergence of classes of NNs with multiple EPs and with nonsymmetric interconnection matrices. The conditions should ensure convergence not only for the nominal interconnections but also for perturbations of the nominal interconnections.

This paper addresses convergence of solutions for a class of differential inclusions termed differential variational inequalities (DVIs) [21]–[23]. Any DVI in this class describes the dynamics of an NN with multiple EPs, evolving in a closed hypercube of the state space, and defined by a continuously differentiable (possibly) nonreciprocal vector field $f$. Such DVIs include the relevant class of NNs modeled by linear systems operating in a closed hypercube, also known as linear systems in saturated mode (LSSMs). The LSSMs have been introduced by Li, Michel, and Porod [24] in order to remedy some shortcomings of the Hopfield NNs [16]. Indeed, an LSSM is easier to implement electronically than a Hopfield NN and can lead to more effective design procedures in the field of associative memories [24]. In an LSSM $f$ is an affine or a linear vector field defined by the neuron interconnection matrix $A$ of the LSSM. Non-reciprocity of $f$ means that $A$ is nonsymmetric. The DVIs also include the ideal full-range (FR) model of cellular neural networks (CNNs), introduced by Rodríguez-Vázquez et al. [25], which is at the core of the CNN universal machine [26] and the family of ACE processors implemented in CMOS technology [25], [27].

In the paper we consider the relevant case where $f$ is cooperative or, equivalently, it satisfies the Kamke-Müller condition in an open and convex set containing the hypercube [28].
In this case it has been shown in [29] that the semiflow generated by the DVI is monotone and that monotonicity implies some restrictions on the geometry of omega-limit sets of the solutions. However, monotonicity alone is typically not enough to address convergence of solutions: it is known from the theory of monotone dynamical systems that stronger properties, such as the property that the semiflow is strongly order preserving (SOP), are typically needed [28], [30]. To ensure the SOP property and address convergence of DVIs we introduce a new assumption, which we call the strong Kamke-Müller (SKM) condition. The main results obtained in the paper under the SKM condition are summarized as follows.

The first type of results, which are given in Section IV, concerns the relationship between the SKM condition and the SOP property and the characterization of the SKM condition.
• It is shown that if $f$ satisfies the SKM condition then the semiflow of the DVI is SOP (Theorem 1).
• It is shown that the SKM condition is equivalent to a full interconnection property of the Jacobian of $f$ (Propositions 1, 2 and 3).
• Relevant classes of DVIs that satisfy the SKM condition are singled out (Corollaries 1, 2).

By using the main bulk of the theory for SOP semiflows developed in [28] the paper then establishes the following main results on generic convergence in Section V.
• It is shown that under the SKM condition, as a consequence of the SOP property, the semiflow of a DVI satisfies the LIMIT SET DICHOTOMY (Theorem 2) and it is almost quasi-convergent or almost convergent if in addition the EPs are isolated (Theorems 3, 4, 5 and Corollary 3).

By applying the results to NNs modeled by DVIs we obtain the next results in Section VI.
• Any nonsymmetric, cooperative and fully interconnected LSSM is almost quasi-convergent or almost convergent (Theorem 6). Analogous convergence results are obtained for FRCNNs (Theorem 7).

To the authors' knowledge all the obtained results are original. In particular, Theorems 6, 7 are the only existing results on convergence of nonreciprocal DVIs, and nonsymmetric LSSMs and FRCNNs, possessing multiple EPs. In the paper Theorems 6, 7 are compared with existing results on convergence and multistability of DVIs, LSSMs, and FRCNNs, obtained under different assumptions on the interconnections. The paper also highlights significant differences between the convergence results for NNs modeled by cooperative DVIs and those obtained in the standard setting of cooperative NNs modeled by means of cooperative ordinary differential equations (ODEs) [4], [17]. It is stressed that the condition of full interconnection in Theorem 6 corresponds to the standard assumption made for Hopfield-type NNs and LSSMs [16], [24], and that it is robust with respect to small perturbations of the interconnections, an extremely desirable property in view of the electronic implementation. Instead, Theorem 7 is mainly of theoretic interest since FRCNNs are typically only locally connected.
The organization of the paper is outlined as follows. Section II gives the needed preliminary results, whereas Section III describes the DVI model analyzed in the paper. Then, Sections IV, V establish the main results on the SKM condition, the SOP property and generic convergence of the DVI. Generic convergence of LSSMs and FRCNNs is addressed in Section VI and the main conclusions are collected in Section VII.

Notation: By $\|x\| = (\sum_{i=1}^{n} x_i^2)^{1/2}$ we denote the Euclidean norm of $x \in \mathbb{R}^n$, whereas $\|x\|_\infty = \max_{i=1,2,\dots,n} |x_i|$ is the infinity-norm of $x \in \mathbb{R}^n$. Moreover, $B(x,r) = \{y \in \mathbb{R}^n : \|y-x\| < r\}$ is the open ball with radius $r$ centered at $x \in \mathbb{R}^n$, and $B^\infty(x,r) = \{y \in \mathbb{R}^n : \|y-x\|_\infty \le r\}$ is the closed ball in the infinity norm with radius $r$ around $x \in \mathbb{R}^n$. If $x, y \in \mathbb{R}^n$, we denote by $\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i$ the scalar product of $x$ and $y$. Given $U, V \subset \mathbb{R}^n$, with $\mathrm{dist}(U,V) = \inf_{x \in U, y \in V} \|x-y\|$ we mean the distance between sets $U$, $V$. If $U \subset \mathbb{R}^n$, $\mu(U)$ denotes the Lebesgue measure of $U$. If $f : \mathbb{R}^n \to \mathbb{R}^m$, $f \in C^1(\mathbb{R}^n)$, by $Df(x) = (\partial f_i(x)/\partial x_j)$ we denote the $m \times n$ Jacobian matrix of $f$ at $x \in \mathbb{R}^n$.


II. PRELIMINARIES

A. Monotone Semiflows

In this section we collect definitions and properties of monotone semiflows needed in the paper. We refer the reader to [28], [30] for a thorough treatment.

We consider $\mathbb{R}^n$ as a partially ordered space with the following order relation. If $x, y \in \mathbb{R}^n$, we let $x \le y$ if $x_i \le y_i$, $i \in \{1, 2, \dots, n\} \doteq N$. Moreover, $x < y$ if $x \le y$, $x \ne y$, and $x \ll y$ if $x_i < y_i$, $i \in N$. If $U, V \subset \mathbb{R}^n$, by $U \le V$ ($U < V$) we mean that $x \le y$ ($x < y$) for any $x \in U$, $y \in V$.

Let $K = [-1,1]^n = \{x \in \mathbb{R}^n : \|x\|_\infty \le 1\}$ be the closed unit hypercube in $\mathbb{R}^n$. A semiflow on $K$ is a continuous map $\Phi : [0,+\infty) \times K \to K$, $(t,x) \to \Phi_t(x)$, such that $\Phi_0(x) = x$, $\Phi_t(\Phi_s(x)) = \Phi_{t+s}(x)$, for any $t, s \ge 0$ and $x \in K$. The omega-limit set, $\omega(x)$, of $x \in K$ is the set of points $y \in K$ for which there exists a sequence $t_k \to +\infty$, as $k \to +\infty$, such that $\Phi_{t_k}(x) \to y$. Since the positive orbit through $x$, $\{\Phi_t(x) : t \in [0,+\infty)\}$, has compact closure, $\omega(x)$ is a nonempty, compact, connected subset of $K$. An EP of $\Phi$ is a point $\xi \in K$ satisfying $\Phi_t(\xi) = \xi$, $t \ge 0$. We denote by $E \subset K$ the set of EPs of $\Phi$.

The semiflow $\Phi$ on $K$ is said to be:
• monotone if for any $x, y \in K$ such that $x \le y$ we have $\Phi_t(x) \le \Phi_t(y)$, $t \ge 0$;
• strongly monotone (SM) if $\Phi$ is monotone and for any $x, y \in K$ such that $x < y$ we have $\Phi_t(x) \ll \Phi_t(y)$, $t > 0$;
• eventually strongly monotone (ESM) if $\Phi$ is monotone and for any $x, y \in K$ such that $x < y$ there exists $\bar t > 0$ such that we have $\Phi_t(x) \ll \Phi_t(y)$, $t > \bar t$;
• strongly order preserving (SOP) if $\Phi$ is monotone and for any $x, y \in K$ such that $x < y$ there exist open neighborhoods $U$, $V$ of $x$ and $y$, respectively, and $\bar t \ge 0$ such that $\Phi_{\bar t}(U \cap K) \le \Phi_{\bar t}(V \cap K)$.

If $\Phi$ is ESM, then $\Phi$ is SOP, but the converse is not true in general.

A point $x \in K$ is said to be quasi-convergent if we have $\omega(x) \subset E$, i.e., $\Phi_t(x)$ approaches the set of EPs as $t \to +\infty$. A point $x \in K$ is said to be convergent if we have $\omega(x) = \{\xi\} \subset E$ or, equivalently, $\lim_{t \to +\infty} \Phi_t(x) = \xi \in E$. If the EPs are isolated, then quasi-convergence is known to be equivalent to convergence. We denote by $Q$ the set of quasi-convergent points of $K$ and by $C$ the set of convergent points of $K$.

The semiflow $\Phi$ is said to be:
• almost quasi-convergent if $\mu(K \setminus Q) = 0$, i.e., the generic point is quasi-convergent;
• almost convergent if $\mu(K \setminus C) = 0$, i.e., the generic point is convergent;
• quasi-convergent if $K = Q$, i.e., each point is quasi-convergent;
• convergent if $K = C$, i.e., each point is convergent.

Analogous definitions and properties hold for semiflows in $\mathbb{R}^n$ provided that each point has compact orbit closure.
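The three order relations introduced in this subsection can be made concrete with a few helper functions. The following is only an illustrative sketch: the names `leq`, `lt` and `ll` are not taken from the paper, and vectors are represented as plain Python sequences of equal length.

```python
def leq(x, y):
    """x <= y: x_i <= y_i for every index i."""
    return all(xi <= yi for xi, yi in zip(x, y))


def lt(x, y):
    """x < y: x <= y and x differs from y in at least one component."""
    return leq(x, y) and any(xi < yi for xi, yi in zip(x, y))


def ll(x, y):
    """x << y: x_i < y_i for every index i."""
    return all(xi < yi for xi, yi in zip(x, y))


# Two points of K = [-1, 1]^2 whose first components coincide.
x, y = [0.0, -0.5], [0.0, 0.5]
print(leq(x, y), lt(x, y), ll(x, y))  # prints: True True False
```

Pairs such as the one above, with x < y but not x ≪ y, are those for which strong monotonicity asks for more than monotonicity does.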


B. Cooperative ODEs

Consider the autonomous system of ODEs

$$\dot{x} = F(x) \tag{1}$$

where $x \in \mathbb{R}^n$ and $F \in C^1(\mathbb{R}^n)$. Given $x \in \mathbb{R}^n$ we denote by $x(t)$ the unique solution of (1) with initial condition $x$ at $t = 0$ and assume that, for any $x \in \mathbb{R}^n$, $x(t)$ is bounded and hence defined for any $t \ge 0$. The semiflow $\Phi : \mathbb{R}^n \times [0,+\infty) \to \mathbb{R}^n$ generated by (1) is the collection of mappings $\Phi_t(x) = x(t)$ for any $x \in \mathbb{R}^n$ and $t \ge 0$.

Definition 1 ([28]): We say that $F$ satisfies the Kamke-Müller (KM) condition in $\mathbb{R}^n$ if for each $a, b \in \mathbb{R}^n$, and $i \in N$, such that $a \le b$ and $a_i = b_i$, we have $F_i(a) \le F_i(b)$.

If $F$ satisfies the KM condition in $\mathbb{R}^n$ then $\Phi$ is monotone [28]. The KM condition can be expressed in terms of the sign structure of the Jacobian $DF$. Indeed, $F$ satisfies the KM condition in $\mathbb{R}^n$ if and only if $F$ is cooperative in $\mathbb{R}^n$, i.e., we have

$$\frac{\partial F_i}{\partial x_j}(x) \ge 0, \quad \forall i \ne j, \quad \forall x \in \mathbb{R}^n.$$

We say that $F$ is irreducible in $\mathbb{R}^n$ if the Jacobian $DF(x)$ is irreducible for any $x \in \mathbb{R}^n$. One main result in [28], [30], [31] is that if $F$ is cooperative and irreducible in $\mathbb{R}^n$ then $\Phi$ is SM (hence $\Phi$ is ESM and SOP).

C. Normal and Tangent Cones

Let $W$ be a nonempty, closed, convex subset of $\mathbb{R}^n$. The tangent cone to $W$ at $x \in W$ is defined as [32], [33]

$$T_W(x) = \left\{ v \in \mathbb{R}^n : \liminf_{\rho \to 0^+} \frac{\mathrm{dist}(x + \rho v, W)}{\rho} = 0 \right\}.$$

The normal cone to $W$ at $x \in W$ is given by

$$N_W(x) = \{ p \in \mathbb{R}^n : \langle p, v \rangle \le 0, \ \forall v \in T_W(x) \}.$$

For any $x \in W$, $T_W(x)$ and $N_W(x)$ are nonempty closed convex cones in $\mathbb{R}^n$.

Property 1 ([22]): If $W$ coincides with the hypercube $K$, then for any $x \in K$ we have $N_K(x) = \Lambda(x) = (\lambda(x_1), \dots, \lambda(x_n))'$, where the prime means the transpose and

$$\lambda(\rho) = \begin{cases} (-\infty, 0], & \rho = -1 \\ 0, & \rho \in (-1, 1) \\ [0, +\infty), & \rho = 1. \end{cases}$$

For any $v \in \mathbb{R}^n$, the projection of $v$ on $W$ is the unique point $P_W(v) \in W$ satisfying [21, Cor. 1, p. 23]

$$\|v - P_W(v)\| = \mathrm{dist}(v, W) = \min_{y \in W} \|y - v\|.$$

If $x \in K$, and $W = T_K(x)$, we have for any $v \in \mathbb{R}^n$ and $k = 1, 2, \dots, n$ [29]

$$P_{T_K(x),k}(v) = \begin{cases} v_k, & |x_k| < 1 \\ v_k, & |x_k| = 1, \ x_k v_k \le 0 \\ 0, & |x_k| = 1, \ x_k v_k > 0. \end{cases}$$

III. NEURAL NETWORK MODEL

In the paper we analyze the dynamics of the class of differential inclusions, also termed DVIs [12], [21], [22],

$$\dot{x} \in f(x) - N_K(x) \tag{2}$$

where $x \in \mathbb{R}^n$, $K$ is the closed unit hypercube in $\mathbb{R}^n$, $N_K(x)$ is the normal cone to $K$ at point $x \in K$, $f : U \to \mathbb{R}^n$ is a continuously differentiable vector field in $U$ and $U$ is an open and convex subset of $\mathbb{R}^n$ such that $K \subset U$. The vector field $f$ is in general nonreciprocal, i.e., $\partial f_i(x)/\partial x_j \ne \partial f_j(x)/\partial x_i$ for $i \ne j$.

The DVI (2) describes the dynamics of an NN, defined by the smooth vector field $f$, evolving in the closed hypercube $K$ of $\mathbb{R}^n$. Two important classes of NNs included in the DVI model (2) are discussed in Section VI. The DVI (2) is a relevant special case of the general DVI introduced in [12] for modeling a broad class of NNs, evolving in a compact convex subset of $\mathbb{R}^n$ (not necessarily a hypercube), and defined by a (possibly) nonsmooth (nondifferentiable) vector field $f$. A related NN whose dynamics can be modeled by a DVI has been proposed in [34] for solving in real time a class of linear and quadratic programming problems. The original concept of differential inclusions in the form of DVIs has been developed by Aubin and Cellina in [21]. This concept was later extended to provide a general framework for modeling problems simultaneously involving dynamics, inequalities, and discontinuities, such as those typically found in hybrid systems and variable structure systems [23].

A solution $x(t)$, $t \ge 0$, of the DVI (2) is a function $x(\cdot)$ that is absolutely continuous on any compact interval in $[0,+\infty)$ and such that $x(t) \in K$ for any $t \ge 0$ and $\dot{x}(t) \in f(x(t)) - N_K(x(t))$ for almost all (a.a.) $t \in [0,+\infty)$. It has been shown in [29] that, given any $x \in K$, there exists a unique solution $x(t)$, $t \ge 0$, of the DVI (2) with initial condition $x(0) = x$. Moreover, $x(t)$, $t \ge 0$, is a solution of (2) if and only if it satisfies the projected differential equation

$$\dot{x}(t) = P_{T_K(x(t))}(f(x(t))), \quad \text{for a.a. } t \ge 0. \tag{3}$$
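To see how the projected differential equation (3) acts in practice, the sketch below integrates it with an explicit Euler scheme. This is an illustration under stated assumptions, not a method from the paper: the names `project_on_tangent_cone`, `euler_step` and `simulate`, the step size, the clipping step that keeps the iterate in $K$, and the $2 \times 2$ matrix `A` are all hypothetical choices made for the example. The projection itself follows the componentwise formula for $P_{T_K(x),k}(v)$ given in Section II-C.

```python
def project_on_tangent_cone(x, v, tol=1e-12):
    """Componentwise projection of v on T_K(x) for K = [-1, 1]^n."""
    projected = []
    for xk, vk in zip(x, v):
        on_face = abs(abs(xk) - 1.0) <= tol
        # On a face of K, a component pointing out of K (x_k * v_k > 0) is set to 0;
        # in every other case the component is left unchanged.
        projected.append(0.0 if (on_face and xk * vk > 0.0) else vk)
    return projected


def euler_step(f, x, h):
    """One explicit Euler step for x' = P_{T_K(x)}(f(x)), clipped so that x stays in K."""
    v = project_on_tangent_cone(x, f(x))
    return [min(1.0, max(-1.0, xk + h * vk)) for xk, vk in zip(x, v)]


def simulate(f, x0, h=1e-3, steps=20000):
    """Iterate euler_step from x0 and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = euler_step(f, x, h)
    return x


# Hypothetical cooperative, nonsymmetric linear field f(x) = A x (illustration only).
A = [[1.0, 0.5],
     [0.3, 1.0]]


def f(x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


print(simulate(f, [0.1, 0.2]))
```

With this choice of `A` both components grow until they saturate, so the computed trajectory should settle at, or numerically very near, the vertex $(1, 1)$ of $K$, where $f$ points outward and its projection vanishes.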

It is also shown in [29] that there is at least one EP of (2). We denote by $E$ the set of EPs of (2). The collection of mappings $\Phi_t(x) = x(t)$, for any $x \in K$ and $t \ge 0$, is the continuous semiflow on $K$ generated by (2).

We remark that the DVI (2) generates a semiflow in the closed subset $K$ of $\mathbb{R}^n$, whereas an ODE typically generates a semiflow in $\mathbb{R}^n$ or in an open subset of $\mathbb{R}^n$. Another significant difference with respect to the ODEs is that the semiflow of (2) is in general not injective, i.e., there can exist points $x, y \in K$, $x \ne y$, and an instant $\bar t$ such that $\Phi_{\bar t}(x) = \Phi_{\bar t}(y)$ [29].

IV. SOP SEMIFLOWS OF COOPERATIVE DVIS

In this section we study monotonicity, strong monotonicity, and the SOP property of the semiflow $\Phi$ generated by the DVI (2). The main result is a sufficient condition on the vector field $f$ defining the DVI ensuring that $\Phi$ is SOP (Theorem 1). In Section IV-A we first briefly review previous results on monotonicity of the semiflow $\Phi$ of the DVI in the case where $f$ is cooperative and also negative results on the eventual strong monotonicity and the SOP property of $\Phi$ when $f$ is cooperative and irreducible. Then, in Section IV-B we introduce a new condition on $f$, which we call the strong Kamke-Müller (SKM) condition, enabling us to prove that $\Phi$ is SOP. A characterization of the SKM condition in terms of the interconnection properties of the Jacobian of $f$ is obtained. In Section IV-C we apply the obtained results to the class, relevant in view of the applications to NNs, where $f$ is an affine, or a linear, vector field.

A. Monotonicity

Consider first monotonicity of the semiflow $\Phi$ generated by the DVI (2). It has been proved in [29] that $\Phi$ is monotone if $f$ satisfies the KM condition in $U$, i.e., for any $a, b \in U$ and $i \in N$, such that $a \le b$ and $a_i = b_i$, we have $f_i(a) \le f_i(b)$. Once more the KM condition can be expressed in terms of the sign structure of the Jacobian $Df$ of $f$, namely $f$ satisfies the KM condition in $U$ if and only if $f$ is cooperative in $U$, i.e.,

$$\frac{\partial f_i}{\partial x_j}(x) \ge 0, \quad \forall i \ne j, \quad \forall x \in U.$$
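The sign condition on the Jacobian lends itself to a quick numerical sanity check. The sketch below is a hedged illustration: the function names, the two-neuron field $f(x) = (-x_1 + \tanh x_2, \ -x_2 + 2\tanh x_1)$ and the sampling grid are assumptions made here, not objects from the paper. Checking finitely many sample points can only falsify cooperativity; it cannot prove that the condition holds on all of $U$.

```python
import math


def is_cooperative(J, tol=0.0):
    """True when every off-diagonal entry of the square matrix J is >= -tol."""
    n = len(J)
    return all(J[i][j] >= -tol for i in range(n) for j in range(n) if i != j)


def cooperative_on_samples(jacobian, points):
    """Check the sign condition on Df at each sample point (necessary, not sufficient)."""
    return all(is_cooperative(jacobian(p)) for p in points)


# Jacobian of the hypothetical field f(x) = (-x1 + tanh(x2), -x2 + 2*tanh(x1)).
# It is nonsymmetric, so this f is nonreciprocal.
def jacobian(x):
    return [[-1.0, 1.0 - math.tanh(x[1]) ** 2],
            [2.0 * (1.0 - math.tanh(x[0]) ** 2), -1.0]]


# Coarse grid on a box slightly larger than K = [-1, 1]^2, standing in for the open set U.
grid = [(-1.1 + 0.1 * i, -1.1 + 0.1 * j) for i in range(23) for j in range(23)]
print(cooperative_on_samples(jacobian, grid))
```

For this particular field the off-diagonal entries are positive everywhere, since $1 - \tanh^2$ never vanishes, so the sampled check agrees with the analytic sign structure.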

A comparison with Section II-B shows that the results on monotonicity of the semiflow of the cooperative DVI (2) are completely analogous to those of the cooperative ODE (1).

Consider now strong monotonicity and the SOP property of the semiflow $\Phi$ of (2). Suppose that $f$ is irreducible in $U$, i.e., the Jacobian $Df(x)$ is an irreducible matrix for any $x \in U$. In [29] it has been shown, by means of counterexamples, that there exist DVIs (2) with cooperative and irreducible vector fields for which the generated semiflow $\Phi$ is not SOP (hence $\Phi$ is neither ESM nor SM). Moreover, an example is also presented of a DVI with a cooperative and irreducible vector field for which the LIMIT SET DICHOTOMY is violated. Those results show that, differently from the case of semiflows generated by ODEs, the irreducibility of the Jacobian of the vector field is no longer sufficient for establishing the SOP property and the LIMIT SET DICHOTOMY of semiflows generated by cooperative DVIs (cf. Section II-B).

B. Strong Kamke-Müller Condition and SOP Property

On the basis of the discussion in Section IV-A, in order to obtain conditions ensuring that the semiflow $\Phi$ of the DVI (2) is SOP, we are led to consider assumptions on the vector field $f$ that are stronger than the irreducibility of the Jacobian. To this end we introduce the following strong form of the KM condition.

Definition 2: We say that $f$ satisfies the strong Kamke-Müller (SKM) condition in $U$ if for each $a, b \in U$, and $i \in N$, such that $a < b$ and $a_i = b_i$, we have $f_i(a) < f_i(b)$.

Theorem 1: Suppose that $f$ satisfies the SKM condition in $U$. Then, the semiflow $\Phi$ of the DVI (2) is SOP on $K$.

Proof: The case $n = 1$ is obvious, so we suppose henceforth $n > 1$. Let $x, y \in K$ with $x < y$. If $x \ll y$, then we can find $r > 0$ such that $B(x,r) \cap K \le B(y,r) \cap K$ and monotonicity of $\Phi$ implies $\Phi_t(B(x,r) \cap K) \le \Phi_t(B(y,r) \cap K)$ for any $t \ge 0$. Hence, it suffices to consider the case where

$$I = \{ i \in N : x_i = y_i \} \ne \emptyset.$$

Note that $I \ne N$ (otherwise $x = y$). Let $\ell_0 = \{i \in I : |x_i| < 1\}$, $\ell_+ = \{i \in I : x_i = 1\}$ and $\ell_- = \{i \in I : x_i = -1\}$, where $\{\ell_0, \ell_+, \ell_-\}$ is a partition of $I$. Since $f$ verifies the SKM condition in $U$, then $f$ satisfies the KM condition in $U$ and so $f$ is cooperative. Moreover, since $f$ is continuous, we can find $\eta, \rho > 0$ such that $f_i(v) \ge f_i(u) + \eta$ for any $i \in I$, $u \in B^\infty(x,\rho) \cap K$ and $v \in B^\infty(y,\rho) \cap K$. Moreover, $\rho$ can be chosen such that for any $u \in B^\infty(x,\rho)$, $v \in B^\infty(y,\rho)$ we have $u_i < v_i$ if $i \notin I$, $u_i, v_i > -1$ if $i \in I \setminus \ell_-$ and $u_i, v_i < 1$ if $i \in I \setminus \ell_+$. Let

$$\vartheta = \max \left\{ \|w\|_\infty : w \in f\big(B^\infty(x,\rho) \cap K\big) \cup f\big(B^\infty(y,\rho) \cap K\big) \right\}.$$

Since $f$ is continuous and $B^\infty(x,\rho) \cap K$, $B^\infty(y,\rho) \cap K$ are nonempty, compact subsets of $U$, it follows that $\vartheta$ is well defined. In particular, we have $\vartheta > 0$. Indeed, let $u \in B^\infty(x,\rho) \cap K$. If $f(u) \ne 0$, then $\vartheta \ge \|f(u)\|_\infty > 0$. Otherwise, if $f(u) = 0$, pick any $v \in B^\infty(y,\rho) \cap K$ and $i \in I$. Then, we have $\vartheta \ge |f_i(v)| \ge f_i(v) \ge f_i(u) + \eta = \eta > 0$. Let

$$\delta = \min \left\{ \frac{\eta}{2\vartheta + \eta}\,\rho \, ; \ \frac{\rho}{2} \right\} > 0 \tag{4}$$

and

$$\tau_\delta = \sup \{ \theta \ge 0 : \Phi_t(B^\infty(x,\delta) \cap K) \subset B^\infty(x,\rho), \ \Phi_t(B^\infty(y,\delta) \cap K) \subset B^\infty(y,\rho), \ t \in [0,\theta] \}$$

where $\tau_\delta$ is well defined and we have

$$\tau_\delta \ge \tau_m \doteq \frac{\rho - \delta}{\vartheta} > 0. \tag{5}$$

Indeed, suppose for purpose of contradiction that $\tau_\delta < \tau_m$. Then, we can find $u \in B^\infty(x,\delta) \cap K$ such that $\|\Phi_{\tau_\delta}(u) - x\|_\infty = \rho$ or $v \in B^\infty(y,\delta) \cap K$ such that $\|\Phi_{\tau_\delta}(v) - y\|_\infty = \rho$. In the former case we have

$$\rho = \|\Phi_{\tau_\delta}(u) - x\|_\infty \le \|\Phi_{\tau_\delta}(u) - u\|_\infty + \|u - x\|_\infty \le \int_0^{\tau_\delta} \|\dot\Phi_t(u)\|_\infty \, dt + \delta < \vartheta \tau_m + \delta = \rho,$$

which is a contradiction. The latter case can be dealt with similarly. Thus, $\Phi_t(B^\infty(x,\delta) \cap K) \subset B^\infty(x,\rho) \cap K$ and $\Phi_t(B^\infty(y,\delta) \cap K) \subset B^\infty(y,\rho) \cap K$ for all $t \in [0,\tau_m]$.

Fix any $a \in B(x,\delta) \cap K$ and $b \in B(y,\delta) \cap K$ and consider the function $\Delta(t) = \Phi_t(b) - \Phi_t(a)$, $t \ge 0$. To complete the proof it is enough to show that $\Delta_k(\tau_m) \ge 0$ for any $k \in N$. To this aim, we find it useful to note the following facts.

i) Since $B(x,\delta) \subset B^\infty(x,\delta)$ and $B(y,\delta) \subset B^\infty(y,\delta)$, we have $\Phi_t(a) \in B^\infty(x,\rho) \cap K$ and $\Phi_t(b) \in B^\infty(y,\rho) \cap K$ for any $t \in [0,\tau_m]$.

ii) By the projected differential equation (3), for any $k \in N$ and for a.a. $t \in [0,\tau_m]$ we have

$$\dot\Delta_k(t) = \dot\Phi_{t,k}(b) - \dot\Phi_{t,k}(a) = P_{T_K(\Phi_t(b)),k}(f(\Phi_t(b))) - P_{T_K(\Phi_t(a)),k}(f(\Phi_t(a))).$$

iii) Let $t_1, t_2 \in [0,\tau_m]$ and $k \in I$ be such that $\Phi_{t,k}(b), \Phi_{t,k}(a) \in (-1,1)$ for any $t \in (t_1,t_2)$. Thus, for a.a. $t \in [t_1,t_2]$,

$$\dot\Delta_k(t) = P_{T_K(\Phi_t(b)),k}(f(\Phi_t(b))) - P_{T_K(\Phi_t(a)),k}(f(\Phi_t(a))) = f_k(\Phi_t(b)) - f_k(\Phi_t(a)) \ge \eta.$$

It follows that

$$\Delta_k(t_2) = \Delta_k(t_1) + \int_{t_1}^{t_2} \dot\Delta_k(s) \, ds \ge \Delta_k(t_1) + \eta (t_2 - t_1).$$

Now, we are in position to complete the proof. We distinguish the following four cases.

1) $k \notin I$. Since $\Phi_{\tau_m}(a) \in B^\infty(x,\rho) \cap K$ and $\Phi_{\tau_m}(b) \in B^\infty(y,\rho) \cap K$, we have $\Delta_k(\tau_m) = \Phi_{\tau_m,k}(b) - \Phi_{\tau_m,k}(a) > 0$.

2) $k \in \ell_0$. Then, we have $\Phi_{t,k}(b), \Phi_{t,k}(a) \in (-1,1)$ for any $t \in [0,\tau_m]$. Point iii), (4) and (5) imply

$$\Delta_k(\tau_m) \ge \Delta_k(0) + \eta \tau_m \ge -2\delta + \eta \frac{\rho - \delta}{\vartheta} \ge -\frac{2\eta}{2\vartheta + \eta}\rho + \frac{\eta}{\vartheta}\rho - \frac{\eta}{\vartheta}\frac{\eta}{2\vartheta + \eta}\rho = 0.$$

3) $k \in \ell_+$. If $\Phi_{\tau_m,k}(b) = 1$, then clearly $\Delta_k(\tau_m) \ge 0$. So, we can assume $\Phi_{\tau_m,k}(b) < 1$. Let

$$\tau^* = \inf \{ \theta \in [0,\tau_m] : \Phi_{t,k}(b) < 1, \ t \in [\theta,\tau_m] \}.$$

Note that $\tau^*$ is well defined and we have $0 \le \tau^* < \tau_m$.

Suppose that $\tau^* = 0$. Then, we have $\Phi_{t,k}(b) \in (-1,1)$ for any $t \in (0,\tau_m]$. Thus, $\dot\Phi_{t,k}(b) = f_k(\Phi_t(b)) \ge f_k(u_M) + \eta$ for any $t \in (0,\tau_m]$, where $u_M$ is such that $f_k(u_M) = \max\{f_k(u), \ u \in B^\infty(x,\rho) \cap K\}$. Hence, by (4) and (5)

$$\begin{aligned} 1 > \Phi_{\tau_m,k}(b) &\ge 1 - \delta + (f_k(u_M) + \eta)\tau_m \\ &= 1 + f_k(u_M)\frac{\rho - \delta}{\vartheta} + \frac{\eta}{\vartheta}\left( \rho - \delta\left(1 + \frac{\vartheta}{\eta}\right) \right) \\ &\ge 1 + f_k(u_M)\frac{\rho}{2\vartheta} + \frac{\eta}{\vartheta}\,\rho\,\frac{\vartheta}{2\vartheta + \eta} \\ &> 1 + f_k(u_M)\frac{\rho}{2\vartheta} \end{aligned}$$

i.e., $f_k(u_M) < 0$. It follows that $\dot\Phi_{t,k}(a) \le f_k(u_M) < 0$ for a.a. $t \in [0,\tau_m]$, i.e., $\Phi_{t,k}(a) \in (-1,1)$ for any $t \in (0,\tau_m]$. Thus, arguing as in case 2), we conclude that $\Delta_k(\tau_m) \ge 0$.

Suppose instead that $\tau^* > 0$. Then, we have $\Phi_{\tau^*,k}(b) = 1$ and $\Phi_{t,k}(b) \in (-1,1)$ for any $t \in (\tau^*,\tau_m]$. This means that $f_k(\Phi_{\tau^*}(b)) \le 0$. As a consequence, we have $f_k(\Phi_{\tau^*}(a)) \le f_k(\Phi_{\tau^*}(b)) - \eta \le -\eta$. In turn, this implies $\Phi_{t,k}(a) \in (-1,1)$ in a whole right neighborhood of $\tau^*$. Let

$$\tau' = \sup \{ \theta \in [\tau^*,\tau_m] : \Phi_{t,k}(a) < 1, \ t \in (\tau^*,\theta] \}.$$

Note that $\tau'$ is well defined and $\tau^* < \tau' \le \tau_m$. We have $\Phi_{t,k}(b), \Phi_{t,k}(a) \in (-1,1)$ for any $t \in (\tau^*,\tau')$. Moreover, $\Delta_k(\tau^*) \ge 0$. If we had $\tau' < \tau_m$, then, due to point iii), we would have $\Phi_{\tau',k}(b) = \Phi_{\tau',k}(a) + \Delta_k(\tau') \ge 1 + \Delta_k(\tau^*) + \eta(\tau' - \tau^*) > 1$, which is a contradiction. Thus, $\tau' = \tau_m$ and $\Phi_{t,k}(b), \Phi_{t,k}(a) \in (-1,1)$ for any $t \in (\tau^*,\tau_m)$. By point iii) we obtain $\Delta_k(\tau_m) \ge \Delta_k(\tau^*) + \eta(\tau_m - \tau^*) > \Delta_k(\tau^*) \ge 0$.

4) $k \in \ell_-$. By an argument analogous to that in the previous point 3) we obtain $\Delta_k(\tau_m) \ge 0$.

The next result gives a necessary and sufficient condition on $Df$ in order that $f$ satisfies the SKM condition.

Proposition 1: Let

$$Z_{ij} = \left\{ z \in U : \frac{\partial f_i}{\partial x_j}(z) = 0 \right\} \tag{6}$$

for any $i, j \in N$, $i \ne j$. Then, $f$ satisfies the SKM condition in $U$ if and only if $f$ is cooperative in $U$ and, for any $z \in Z_{ij}$, there exists a sequence $\{h_k\}$, with $h_k > 0$ and $h_k \to 0$ as $k \to +\infty$, such that $z + h_k \hat e_j \notin Z_{ij}$, where $\hat e_j$ is the $j$-th element of the canonical basis of $\mathbb{R}^n$.

Proof: First, suppose that $f$ is cooperative in $U$ and that for any $z \in Z_{ij}$ there exists a sequence $\{h_k\}$, with $h_k > 0$ and $h_k \to 0$ as $k \to +\infty$, such that $z + h_k \hat e_j \notin Z_{ij}$. Let $x, y \in U$ with $x < y$ and $x_i = y_i$ for some $i \in N$. We want to show that $f_i(y) > f_i(x)$. Since $x < y$ and $x_i = y_i$, there exists a proper subset $M$ of $N$ such that $x_j < y_j$ for each $j \in M$. Of course, $M = \{m_1, m_2, \dots, m_\ell\}$ for some sequence $m_1 < m_2 < \dots < m_\ell \in N \setminus \{i\}$. Let us introduce the sequence of points defined by recurrence as

$$u_j = \begin{cases} x, & j = 0 \\ u_{j-1} + \hat e_{m_j}(y_{m_j} - x_{m_j}), & j \in \{1, 2, \dots, \ell\}. \end{cases}$$

Note that $u_\ell = y$. Also consider the piecewise affine curve $\varphi : [0,\ell] \to U$ connecting points $u_0, u_1, \dots, u_\ell$. Since $f_i(x) = f_i(\varphi(0))$, $f_i(y) = f_i(\varphi(\ell))$, the map $f_i \circ \varphi$ is absolutely continuous on $[0,\ell]$ and $\partial f_i(\varphi(s))/\partial x_j \ge 0$ for any $j \ne i$ and $s \in [0,\ell]$, we have

$$\begin{aligned} f_i(y) &= f_i(x) + \int_0^\ell \langle Df_i(\varphi(s)), \dot\varphi(s) \rangle \, ds \\ &= f_i(x) + \sum_{j=1}^{\ell} \int_{j-1}^{j} \frac{\partial f_i}{\partial x_{m_j}}\big(u_{j-1} + (s - j + 1)(y_{m_j} - x_{m_j})\hat e_{m_j}\big)(y_{m_j} - x_{m_j}) \, ds \\ &\ge f_i(x) + (y_{m_1} - x_{m_1}) \int_0^1 \frac{\partial f_i}{\partial x_{m_1}}\big(x + s(y_{m_1} - x_{m_1})\hat e_{m_1}\big) \, ds \end{aligned}$$

where $y_{m_1} - x_{m_1} > 0$. Moreover, we can find a sequence $\{s_k\} \subset (0,1)$ with $s_k \to 0$ as $k \to +\infty$ such that

$$\frac{\partial f_i}{\partial x_{m_1}}\big(x + s_k(y_{m_1} - x_{m_1})\hat e_{m_1}\big) > 0$$

for any positive integer $k$. Due to the continuity of $\partial f_i/\partial x_{m_1}$, for each positive integer $k$ there exists an open interval $J_k \subset [0,1]$ such that $s_k \in J_k$ and

$$\frac{\partial f_i}{\partial x_{m_1}}\big(x + s(y_{m_1} - x_{m_1})\hat e_{m_1}\big) > 0$$

for any $s \in J_k$. It follows that

$$f_i(y) - f_i(x) \ge (y_{m_1} - x_{m_1}) \int_0^1 \frac{\partial f_i}{\partial x_{m_1}}\big(x + s(y_{m_1} - x_{m_1})\hat e_{m_1}\big) \, ds > 0.$$

Now, assume that $f$ satisfies the SKM condition in $U$. Then, $f$ also satisfies the KM condition in $U$ and so $f$ is cooperative. Suppose for contradiction that there exist $i \ne j$, $z \in Z_{ij}$ and $\hat h > 0$ such that $z + h \hat e_j \in Z_{ij}$ for any $h \in (0, 2\hat h)$. Let $x = z$ and $y = x + \hat h \hat e_j$. Note that $x < y$ and $x_i = y_i$ so that, due to the SKM condition, $f_i(y) > f_i(x)$. On the other hand, we have

$$f_i(y) = f_i(x) + \int_0^{\hat h} \frac{\partial f_i}{\partial x_j}(x + h \hat e_j) \, dh = f_i(x)$$


which is a contradiction.

A totally ordered segment is a segment that can be written as $J = \{\nu a + (1 - \nu) b, \ \nu \in [0,1]\}$ for some $a, b \in \mathbb{R}^n$ with $a < b$. Given $x \in U$, we say that $Df(x)$ is fully interconnected if

$$\frac{\partial f_i}{\partial x_j}(x) \ne 0, \quad \forall i \ne j.$$

Note that, if $f$ is cooperative, $Df(x)$ is fully interconnected if and only if

$$\frac{\partial f_i}{\partial x_j}(x) > 0, \quad \forall i \ne j.$$

Proposition 1 yields the next sufficient condition in order that $f$ satisfies the SKM condition.

Proposition 2: Suppose that $f$ is cooperative in $U$. Then, $f$ satisfies the SKM condition in $U$ provided that, for any totally ordered segment $J \subset U$, there exists $x \in J$ such that $Df(x)$ is fully interconnected.

Proof: Let $z \in Z_{ij}$ for some $i, j \in N$, $i \ne j$, and fix any $\epsilon > 0$ such that $w_\epsilon = z + \epsilon \hat e_j \in U$. Consider the totally ordered segment $J = \{\nu z + (1 - \nu) w_\epsilon, \ \nu \in [0,1]\} \subset U$. Then, we can find $x \in J$ such that $Df(x)$ is fully interconnected. Note that $x \ne z$. Indeed, since $z \in Z_{ij}$, $Df(z)$ is not fully interconnected. Thus, $x = z + h_\epsilon \hat e_j$ for some $h_\epsilon \in (0, \epsilon]$. Recalling that $\epsilon$ has been chosen arbitrarily, the result follows from Proposition 1.

We say that the vector field $f$ is fully interconnected in $U$ if

$$\frac{\partial f_i}{\partial x_j}(x) \ne 0, \quad \forall i \ne j, \quad \forall x \in U.$$

If $f$ is cooperative, $f$ is fully interconnected in $U$ if and only if

$$\frac{\partial f_i}{\partial x_j}(x) > 0, \quad \forall i \ne j, \quad \forall x \in U.$$

Corollary 1: If the vector field $f$ is cooperative and fully interconnected in $U$ then $f$ satisfies the SKM condition in $U$ and the semiflow of the DVI (2) is SOP on $K$.

Proof: Follows from Proposition 2 and Theorem 1.

Full interconnection is a slightly stronger requirement on $f$ than the conditions in Propositions 1, 2, but it is easier to handle. It also represents a strong form of irreducibility of $f$.

C. Cooperative Affine and Linear DVIs

Consider the important special case, in view of the applications to NNs, of DVIs of the type

$$\dot{x} \in Ax + I - N_K(x) \tag{7}$$

where $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{n \times n}$ and $I \in \mathbb{R}^n$. Matrix $A$ is in general nonsymmetric (nonreciprocal). The DVI (7) is defined by an affine vector field $f(x) = Ax + I$ if $I \ne 0$ or a linear vector field $f(x) = Ax$ if $I = 0$. The vector field $f(x) = Ax + I$ is cooperative in $\mathbb{R}^n$ if $a_{ij} \ge 0$ for any $i \ne j$, it is irreducible in $\mathbb{R}^n$ if $A$ is irreducible and it is fully interconnected in $\mathbb{R}^n$ if $a_{ij} \ne 0$ for any $i \ne j$.

Proposition 3: The vector field $f(x) = Ax + I$ satisfies the SKM condition in $\mathbb{R}^n$ if and only if it is cooperative and fully interconnected, i.e., if and only if $a_{ij} > 0$ for any $i \ne j$.
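For affine and linear fields the properties just listed are entrywise tests on $A$, so they are easy to check mechanically. The sketch below is illustrative only: the function names and the two sample matrices are assumptions made for the example, and irreducibility is tested through the standard equivalence with strong connectivity of the directed graph that has an edge $i \to j$ whenever $a_{ij} \ne 0$.

```python
def off_diagonal(A):
    """Return the off-diagonal entries a_ij, i != j, of a square matrix A."""
    n = len(A)
    return [A[i][j] for i in range(n) for j in range(n) if i != j]


def is_cooperative(A):
    return all(a >= 0 for a in off_diagonal(A))


def is_fully_interconnected(A):
    return all(a != 0 for a in off_diagonal(A))


def satisfies_skm(A):
    """Entrywise test stated in Proposition 3: a_ij > 0 for all i != j."""
    return all(a > 0 for a in off_diagonal(A))


def is_irreducible(A):
    """For n >= 2: strong connectivity of the digraph with edge i -> j iff a_ij != 0."""
    n = len(A)

    def reaches_all(has_edge):
        seen, stack = {0}, [0]
        while stack:
            i = stack.pop()
            for j in range(n):
                if j != i and has_edge(i, j) and j not in seen:
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n

    # Node 0 must reach every node along the edges and along the reversed edges.
    return reaches_all(lambda i, j: A[i][j] != 0) and reaches_all(lambda i, j: A[j][i] != 0)


# Hypothetical nonsymmetric matrices (illustration only).
A_full = [[-1.0, 0.2, 0.3],
          [0.1, -1.0, 0.4],
          [0.5, 0.2, -1.0]]
A_ring = [[-1.0, 0.3, 0.0, 0.2],
          [0.2, -1.0, 0.3, 0.0],
          [0.0, 0.2, -1.0, 0.3],
          [0.3, 0.0, 0.2, -1.0]]

for name, A in (("A_full", A_full), ("A_ring", A_ring)):
    print(name, is_cooperative(A), is_irreducible(A),
          is_fully_interconnected(A), satisfies_skm(A))
```

The ring-shaped matrix is cooperative and irreducible but has zero off-diagonal entries, so it fails the entrywise test of Proposition 3, whereas the fully populated matrix passes it.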

Proof: If $a_{ij} > 0$ for any $i \ne j$, then $f$ satisfies the SKM condition in $\mathbb{R}^n$ by Corollary 1. Conversely, suppose that $f$ satisfies the SKM condition in $\mathbb{R}^n$. Then, $f$ satisfies the KM condition and $f$ is cooperative, i.e., $a_{ij} \ge 0$ for any $i \ne j$. Suppose now, for contradiction, that there exist $i \ne j$ such that $a_{ij} = 0$. Then, $\partial f_i(x)/\partial x_j = a_{ij} = 0$ for any $x \in \mathbb{R}^n$ and, from (6), $Z_{ij} = \mathbb{R}^n$. By the necessary part in Proposition 1 it follows that $f$ cannot satisfy the SKM condition in $\mathbb{R}^n$, which is a contradiction. Then, we have $a_{ij} > 0$ for any $i \ne j$.

Corollary 2: If we have $a_{ij} > 0$ for any $i \ne j$ then the vector field $f(x) = Ax + I$ satisfies the SKM condition in $\mathbb{R}^n$ and the semiflow $\Phi$ generated by the DVI (7) is SOP on $K$.

Proof: Follows from Proposition 3 and Theorem 1.

D. Remarks

1) We have seen in Theorem 1 that the SKM condition implies that the semiflow $\Phi$ of the DVI (2) is SOP. However, Example 1 in [29] shows that there are cooperative and fully interconnected DVIs, thus satisfying the SKM condition, for which the semiflow is SOP but not ESM.

2) An injective semiflow on $\mathbb{R}^n$, such as that generated by an ODE, is ESM if and only if it is SOP [28, Prop. 1.2]. We have seen in Section III that the semiflow $\Phi$ of the DVI (2) is not injective. Moreover, as discussed in the previous remark, $\Phi$ may be SOP but not ESM. The considered class of cooperative DVIs (2) is believed to be a significant example in a finite dimensional space where the SOP property is more flexible than the property that $\Phi$ is ESM for addressing the LIMIT SET DICHOTOMY and generic convergence.

3) The assumption that $f$ is cooperative and irreducible in $U$ is not sufficient to ensure that $f$ satisfies the SKM condition in $U$. Otherwise, Theorem 1 would imply that the semiflow is SOP, whereas, as shown in [29, Sec. 4.1], there are DVIs defined by cooperative and irreducible vector fields in $\mathbb{R}^n$ for which the semiflow is not SOP.

4) The SKM condition guarantees that the semiflow $\Phi$ of the DVI (2) is SOP. The next example shows that the SKM condition is however not necessary for SOP.

Example 1. Consider a linear DVI (7) with $I = 0$,

$$A = \begin{pmatrix} -1 & r & 0 & 0 & \cdots & s \\ s & -1 & r & 0 & \cdots & 0 \\ 0 & s & -1 & r & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & s & -1 & r \\ r & \cdots & 0 & 0 & s & -1 \end{pmatrix} \in \mathbb{R}^{n \times n},$$

$x \in \mathbb{R}^n$, $r, s > 0$ and $r + s < 1$. Matrix $A$ is cooperative and irreducible, but not fully interconnected, so the DVI does not satisfy the SKM condition (Proposition 3). Nevertheless, it can be shown that its semiflow is SOP. Suppose that $x$ belongs to the boundary of $K$ and we have $x_j = 1$ for some $j \in N$. It can easily be proved that $(Ax)_j < 0$. Similarly, if $x_k = -1$ for some $k \in N$, then $(Ax)_k > 0$. This implies that on the boundary of $K$ the linear vector field $Ax$ points inwards. Then, given any


initial condition x ∈ K, x(t) belongs to the interior of K for any t > 0, hence it satisfies the linear system ẋ = Ax for t > 0. Since A is cooperative and irreducible, given x < y ∈ K we can conclude that x(t) ≪ y(t) for any t > 0, hence the semiflow is SM and it is also SOP.

V. LIMIT SET DICHOTOMY AND GENERIC CONVERGENCE OF COOPERATIVE DVIs

We have seen in Theorem 1 that if f satisfies the SKM condition then the semiflow of the DVI (2) is SOP on K. Then, we can apply the main theoretical results developed for SOP semiflows in [28] to study generic convergence of (2). Three main steps are involved. First of all, a LIMIT SET DICHOTOMY is established for the semiflow generated by (2) (Theorem 2). The LIMIT SET DICHOTOMY implies a result on convergence for points belonging to any ordered arc contained in K (Theorem 3) and, finally, the result on ordered arcs implies generic convergence of (2) in the whole set K (Theorem 4).

Theorem 2 (LIMIT SET DICHOTOMY): Suppose that f satisfies the SKM condition in U. Then, for any x, y ∈ K such that x < y, we have either
1) ω(x) < ω(y), or
2) ω(x) = ω(y) ⊂ E.

Proof: By Theorem 1 the semiflow Φ of (2) is SOP on K; moreover, any orbit of (2) has compact closure. Then, the result follows from Theorem 1.13 in [28].

Theorem 3: If f satisfies the SKM condition in U, and J ⊂ K is a totally ordered arc, then J\Q is either finite or countable.

Proof: Follows from the LIMIT SET DICHOTOMY in Theorem 2 and Theorem 1.19 in [28].

Theorem 4 (Generic Convergence): Suppose that f satisfies the SKM condition in U. Then, µ(K\Q) = 0, i.e., the DVI (2) is almost quasi-convergent. If in addition the EPs of (2) are isolated then µ(K\C) = 0, i.e., the DVI (2) is almost convergent.

Proof: Given any totally ordered arc J ⊂ K, we have from Theorem 3 that J\Q is either finite or countable.
By an argument based on Fubini's theorem, such as that in the proof of Theorem 4.1 in [31], it is possible to show that µ(K\Q) = 0, i.e., (2) is almost quasi-convergent. The result on almost convergence follows from the fact that for isolated equilibria quasi-convergence is equivalent to convergence.

As usual we can subdivide the hypercube K into 3^n subsets as follows. Let Γ = {γ ∈ R^n : γ_i ∈ {−1, 0, 1}, i ∈ N}. Each subset Ξ_γ is identified by a vector γ ∈ Γ, namely, Ξ_γ = {x ∈ K : |x_i| < 1, γ_i = 0; x_i = γ_i, γ_i ∈ {−1, 1}, i ∈ N}. Subsets Ξ_γ are a partition of K, i.e., we have Ξ_{γ_a} ∩ Ξ_{γ_b} = ∅ for any γ_a ≠ γ_b ∈ Γ and ∪_{γ∈Γ} Ξ_γ = K. Let L = L_γ = {i ∈ N : γ_i = 0} (indexes of the linear variables in Ξ_γ) and S = S_γ = N\L (indexes of the saturated variables in Ξ_γ). Let |L| be the cardinality of L, i.e., the number of linear variables in Ξ_γ. Consider a subset Ξ_γ such that 1 ≤ |L| < n. Note that the closure of Ξ_γ, cl(Ξ_γ), is a subset of an |L|-dimensional affine subspace of R^n. We denote by µ_|L| the |L|-dimensional


Lebesgue measure. Theorem 4 guarantees generic convergence of the solutions of the DVI (2) with respect to the Lebesgue measure µ in R^n (Section II). By means of Theorem 3 we can prove the next stronger result on generic convergence in Ξ_γ with respect to the lower dimensional measure µ_|L| relative to the linear variables in Ξ_γ.

Theorem 5 (Generic Convergence for Saturated Subsets): Suppose that f satisfies the SKM condition in U. Then, for any subset Ξ_γ of K, such that 1 ≤ |L| < n, we have µ_|L|(cl(Ξ_γ)\Q) = 0. If in addition the EPs of (2) are isolated then we have µ_|L|(cl(Ξ_γ)\C) = 0.

Proof: Without loss of generality we can suppose that γ_i = 0 for all i ∈ {1, 2, . . . , m} and |γ_i| = 1 for all i ∈ {m + 1, m + 2, . . . , n} = S, for some 1 ≤ m = |L| < n. If m = 1, then cl(Ξ_γ) reduces to a totally ordered line segment. By Theorem 3, cl(Ξ_γ)\Q is either finite or countable, hence we have µ_1(cl(Ξ_γ)\Q) = 0. Now, assume m > 1. Since cl(Ξ_γ)\Q ⊂ cl(Ξ_γ) = [−1, 1]^m × {γ_S}, where γ_S = (γ_{m+1}, γ_{m+2}, . . . , γ_n)′, we can find a subset Θ of [−1, 1]^m such that cl(Ξ_γ)\Q = Θ × {γ_S} (Θ is the projection of cl(Ξ_γ)\Q onto R^m). To complete the proof, it suffices to show that Θ is a zero measure subset of R^m. Fix any totally ordered segment J ⊂ [−1, 1]^m. Then, J × {γ_S} ⊂ R^n is a totally ordered segment contained in cl(Ξ_γ). Due to Theorem 3 the set (Θ × {γ_S}) ∩ (J × {γ_S}) = (cl(Ξ_γ)\Q) ∩ (J × {γ_S}) is either finite or countable. In turn, this implies that the set Θ ∩ J is either finite or countable. Since J is arbitrary, by using an argument based on Fubini's theorem such as that in the proof of Theorem 4.1 in [31] we can conclude that Θ is a zero measure subset of R^m.

Finally, by combining the convergence results in this section with those in Section IV-B, we obtain the next result, which is useful in view of the applications to NNs.
Corollary 3: The results on generic convergence in Theorems 4 and 5 hold in particular if f is cooperative and fully interconnected in U or if f(x) = Ax + I is an affine or a linear vector field and we have a_ij > 0 for any i ≠ j.

Proof: Follows from the fact that in both cases the SKM condition is satisfied (Corollaries 1, 2).

VI. APPLICATION TO NEURAL NETWORKS
A. Cooperative LSSMs

In the article [24], Li, Michel and Porod considered a class of NNs defined by affine or linear differential equations evolving in a closed hypercube of R^n. Such NNs, also named LSSMs, are described by the equations

\[
\dot{x} = Ax + I \tag{8}
\]

with the constraints −1 ≤ x_i ≤ 1, i ∈ N or, equivalently, x = (x_1, x_2, . . . , x_n)′ ∈ K. In model (8), A = (a_ij) ∈ R^{n×n} is the neuron interconnection matrix and I ∈ R^n is a constant input. The peculiarity is that LSSMs are


defined in a closed hypercube of R^n, while affine or linear systems are typically defined in an open subset of R^n. The LSSMs can be considered as modified/improved Hopfield NN models. LSSMs were originally introduced to remedy some shortcomings of the original Hopfield NNs [16]. In fact, it is hard to find the locations of the EPs for sigmoid activations as in the Hopfield model, which makes it difficult to develop design procedures for Hopfield associative memories and to check their performance. Instead, as shown in [24], the EPs of LSSMs can be analytically found by solving suitable sets of affine or linear systems and this in turn makes it possible to develop an effective design procedure for LSSM-based associative memories. A dedicated package for analyzing and designing LSSMs has been developed for MATLAB [35]. The LSSM model is also easier to implement with electronic circuits than the Hopfield model, see [24, Sec. III] for the details.

After its introduction, the LSSM model (8), and discrete-time versions of the same model, have been widely investigated in the literature from both a theoretical and an application viewpoint in the context of pattern formation and associative memories, real-time optimization in closed domains, and digital filtering and saturation arithmetic, see, e.g., [36]–[41] and references therein. Some related models are considered in [34], [42]–[44].

In [40] it is noticed that the LSSM model (8) can be rewritten as

\[
\dot{x} \in Ax + I - \partial\psi_K(x) \tag{9}
\]

where ψ_K(x) = 0 when x ∈ K and ψ_K(x) = +∞ when x ∉ K is the indicator of K [21, p. 22] and ∂ψ_K(x) is the subgradient of ψ_K(x) [21]. We have [21, Definition 5, p. 33] ∂ψ_K(x) = N_K(x) for any x ∈ K, so that the LSSM model is equivalent to the DVI (2) with a vector field f(x) = Ax + I.

We say that an LSSM is nonreciprocal if A is nonsymmetric, cooperative if a_ij ≥ 0 for any i ≠ j, irreducible if A is irreducible and fully interconnected if a_ij ≠ 0 for any i ≠ j.
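The four structural properties just defined can be checked mechanically on a given interconnection matrix. The following is a minimal sketch in plain Python, not taken from [24] or from the MATLAB package [35]: the function names `classify_lssm` and `cyclic_matrix`, the values r = 0.3, s = 0.4, and the 3×3 test matrix are all assumptions made for illustration. Irreducibility is tested through its standard equivalence with strong connectivity of the directed graph associated with the off-diagonal nonzero entries of A.

```python
def classify_lssm(A):
    """Classify an interconnection matrix A, given as a list of rows.

    Hypothetical helper, for illustration only.
    """
    n = len(A)
    offdiag = [(i, j) for i in range(n) for j in range(n) if i != j]

    def reaches_all(edge):
        # Depth-first search from node 0 over the graph defined by edge(i, j).
        seen, stack = {0}, [0]
        while stack:
            i = stack.pop()
            for j in range(n):
                if j != i and j not in seen and edge(i, j):
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n

    # A is irreducible iff its directed graph is strongly connected, i.e.
    # node 0 reaches every node along the edges and along the reversed edges.
    irreducible = (reaches_all(lambda i, j: A[i][j] != 0)
                   and reaches_all(lambda i, j: A[j][i] != 0))

    return {
        "nonreciprocal": any(A[i][j] != A[j][i] for i, j in offdiag),
        "cooperative": all(A[i][j] >= 0 for i, j in offdiag),
        "irreducible": irreducible,
        "fully_interconnected": all(A[i][j] != 0 for i, j in offdiag),
    }


def cyclic_matrix(n, r, s):
    """Matrix of Example 1: -1 on the diagonal, r after it, s before it (cyclically)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = -1.0
        A[i][(i + 1) % n] = r
        A[i][(i - 1) % n] = s
    return A


# r = 0.3 and s = 0.4 are assumed values with r, s > 0 and r + s < 1.
example1 = classify_lssm(cyclic_matrix(5, 0.3, 0.4))
# Hypothetical nonsymmetric matrix with strictly positive off-diagonal entries.
dense = classify_lssm([[-1.0, 0.5, 0.2],
                       [0.1, -1.0, 0.3],
                       [0.4, 0.6, -1.0]])
print(example1)
print(dense)
```

With these inputs the cyclic matrix of Example 1 is reported as cooperative and irreducible but not fully interconnected, whereas the dense matrix has all four properties and therefore falls under the affine case of Corollary 3. Exact comparison with zero is used for simplicity; for matrices obtained from measurements a tolerance would be more appropriate.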
Theorem 6: A cooperative and fully interconnected LSSM (8) is almost quasi-convergent and it is almost convergent if the EPs are isolated.

Proof: Follows from Corollary 3.

To the authors' knowledge Theorem 6 is the only existing result on convergence of nonsymmetric LSSMs possessing multiple EPs. Indeed, [24] establishes convergence of LSSMs with multiple EPs, by means of a Lyapunov approach, in the case of a symmetric interconnection matrix A, whereas [38], [39], [41] give results on global stability of nonsymmetric LSSMs in the special case where there exists a unique EP. We also point out that the hypothesis of full interconnection of LSSMs as in Theorem 6 is not overly restrictive. In fact, the typical synthesis procedures for associative memories based on Hopfield NNs, or LSSMs, usually lead to fully interconnected matrices [3], [24].

Almost quasi-convergence and almost convergence, as in Theorem 6, are slightly weaker properties than convergence but almost as useful in the practical applications of NNs. A detailed discussion can be found in [17, Sec.


III], [4, Sec. 5] and [1, Sec. V]. Here we only point out that if the LSSM (8) is quasi-convergent, then there is probability zero of picking at random initial conditions in K that originate non-convergent solutions. Moreover, these nonconvergent solutions are unstable and they cannot be observed in a physical implementation of (8) due to noise.

Theorem 6 is potentially useful in the design of associative memories based on LSSMs and may lead to guidelines for their implementation using analog or mixed-signal hardware. It is pointed out that if a nominal cooperative LSSM has a fully interconnected matrix A_0, small perturbations of A_0 preserve the property of full interconnection. Theorem 6 thus ensures robustness of convergence with respect to small perturbations of A_0. As discussed in Section I, robustness properties of this kind are extremely important in view of the applications, since deviations from nominal values are unavoidable in the electronic implementation of the neuron interconnections. The next example provides an illustration.

Example 2. Let us consider a nominal third-order cooperative LSSM

\[
\dot{x} = A_0 x + I \tag{10}
\]

where x = (x_1, x_2, x_3)′ ∈ R^3 satisfies −1 ≤ x_i ≤ 1, i = 1, 2, 3 or equivalently x ∈ K_3 = [−1, 1]^3,

\[
A_0 = \begin{pmatrix}
2 & 1.4 & 1.4 \\
1.4 & 2 & 1.4 \\
1.4 & 1.4 & 2
\end{pmatrix}
\]

is a symmetric and fully interconnected matrix and I = (1.5, −3, −1.5)′. By means of Theorem 4.2 in [24], which characterizes the stable and unstable EPs of LSSMs, it can be checked that (10) has four locally asymptotically stable (AS) EPs on the following vertexes of K_3

\[
(-1, -1, -1)', \quad (1, -1, -1)', \quad (1, -1, 1)', \quad (1, 1, 1)' \tag{11}
\]
and three unstable EPs at (0.65, −1, −1)′, (1, −1, 0.75)′, (1, 0.1, 1)′. Since A_0 is symmetric, (10) admits a global Lyapunov function and it is convergent by Theorem 7 in [12]. The nominal LSSM (10) can be considered as a simple associative memory storing the four AS EPs (11). Consider now the perturbed LSSM

\[
\dot{x} = (A_0 + \epsilon A_p)x + I \tag{12}
\]

with −1 ≤ x_i ≤ 1, i = 1, 2, 3 where ε is a real parameter and

\[
A_p = \begin{pmatrix}
0 & 1 & 1 \\
0 & 0 & 0 \\
-1 & -1 & 0
\end{pmatrix}
\]

is a nonsymmetric matrix. It can be checked by using Theorem 4.2 in [24] that the vertex (−1, −1, −1)′ continues to be an AS EP of (12) for −1.65 < ε < 3.15. Similarly, (1, −1, −1)′ and (1, 1, 1)′ are AS EPs of (12) for ε < 0.35 and −3.15