A NEW MERIT FUNCTION FOR NONLINEAR COMPLEMENTARITY PROBLEMS AND A RELATED ALGORITHM
Francisco Facchinei^1 and João Soares^2

^1 Università di Roma "La Sapienza", Dipartimento di Informatica e Sistemistica, Via Buonarroti 12, 00185 Roma, Italy. e-mail: [email protected]

^2 Columbia University, Graduate School of Business, 804 Uris Hall, New York, NY 10027, USA. e-mail: [email protected]

Abstract: We investigate the properties of a new merit function which allows us to reduce
a nonlinear complementarity problem to an unconstrained global minimization problem. Assuming that the complementarity problem is defined by a $P_0$-function, we prove that every stationary point of the unconstrained problem is a global solution; furthermore, if the complementarity problem is defined by a uniform $P$-function, the level sets of the merit function are bounded. The properties of the new merit function are compared with those of the Mangasarian-Solodov implicit Lagrangian and of Fukushima's regularized gap function. We also introduce a new, simple, active-set local method for the solution of complementarity problems and show how this local algorithm can be made globally convergent by using the new merit function.
Key Words: Nonlinear complementarity problem, merit function, semismoothness, global convergence, quadratic convergence.
1 Introduction

We consider the nonlinear complementarity problem:

$$F(x) \ge 0, \qquad x \ge 0, \qquad F(x)^T x = 0, \tag{NC}$$
where $F: \mathbb{R}^n \to \mathbb{R}^n$ is everywhere continuously differentiable. Recent research on the numerical solution of Problem (NC) has focused on the development of globally convergent algorithms. To this end, two approaches have been investigated: the transformation of the nonlinear complementarity problem into a minimization problem, and the use of continuation methods. Strictly related to the first approach is the equation-reduction approach, which tries to solve Problem (NC) by solving an equivalent system of equations, while interior-point methods are close to the continuation approach. There exists a considerable body of literature on the theoretical properties of continuation methods, and interior-point methods appear to be valuable in the practical solution of linear complementarity problems; however, in recent years the minimization approach seems to have raised much more interest, and most if not all of the proposals and developments of practical algorithms for the solution of nonlinear complementarity problems follow this approach. The minimization approach is based on the introduction of a merit function whose (possibly constrained) global minima are the solutions of the nonlinear complementarity problem; the latter problem is then solved by applying a suitable minimization algorithm to the merit function. The definition of a merit function is often, even if not always, based on a preliminary equation reformulation of the complementarity problem. More precisely, one first defines a system of equations $H(x) = 0$ whose solutions coincide with the solutions of the complementarity problem, and then uses $\|H(x)\|^2$ (or $\|H(x)\|$) as merit function. Before continuing our discussion we give a formal definition of merit function.
Definition 1.1 Let $C \subseteq \mathbb{R}^n$ be given. A merit function for Problem (NC) is a nonnegative function $M: C \to \mathbb{R}$ such that $x$ is a solution of Problem (NC) if and only if $x \in C$ and $M(x) = 0$, i.e., if and only if the solutions of Problem (NC) coincide with the global solutions of the problem

$$\min M(x), \quad x \in C, \tag{PM}$$

with optimal value 0.
Note that if the complementarity problem has no solutions, then a merit function either has global solutions with positive value or has no global solutions at all. It is not difficult to find a merit function for Problem (NC); the challenging task is to find a merit function which enjoys properties that are useful from the computational point of view. For example, one could consider the merit function $M(x) = F(x)^T x$, whose global minimizers on the set $C := \{x \mid x \ge 0,\ F(x) \ge 0\}$ are the solutions of the complementarity problem (NC). But seeking these global minimizers is not easy because, even in very simple cases, the structure of $C$ may be very complicated and the minimization problem can have stationary points which are not global solutions. There have been several proposals of merit functions (or equation reformulations); the seminal work is [18], where a smooth equation reformulation is given; other
papers related to smooth reformulations include, e.g., [7, 11, 12, 15, 16, 17, 20, 22, 34]; nonsmooth reformulations, instead, are used in, e.g., [4, 8, 13, 24, 30, 33, 37, 38]. It is often difficult, if at all possible, to compare different merit functions; however, we think that the main points which should be considered when evaluating a merit function $M$ are:

1. the conditions under which every stationary point of Problem (PM) is a global solution of Problem (PM);

2. the conditions under which the sets $L(\alpha) := \{x \in \mathbb{R}^n : x \in C,\ M(x) \le \alpha\}$ are bounded;

3. the degree of smoothness of $M$;

4. the structure of the set $C$;

obviously all these points have great practical significance. The numerical performance of algorithms based on Problem (PM) should also be considered, even if one should always keep in mind that numerical results also depend on the particular algorithm chosen to solve Problem (PM). There is generally a trade-off between the simplicity of Problem (PM) and its properties. For example, differentiable merit functions tend to be more ill-conditioned than nondifferentiable ones and do not generally allow us to develop superlinearly convergent algorithms for degenerate problems; on the other hand, nondifferentiable merit functions do not have these drawbacks, but generally require ad hoc, complex minimization algorithms. Constrained equivalent reformulations are usually valid under weaker assumptions than their unconstrained counterparts, but solving a constrained minimization problem is more difficult than solving an unconstrained one, and so on. In order to put this work in perspective, and also to illustrate the points discussed above, we now briefly recall the properties of two recently proposed merit functions: the implicit Lagrangian of Mangasarian and Solodov ([20]) and Fukushima's regularized gap function ([11], but see also [1]). These two merit functions are, in our opinion, among the most interesting proposals in the field. The implicit Lagrangian is defined as (to simplify, we have fixed a free parameter)

$$M_{ms}(x) := x^T F(x) + \frac{1}{4}\left( \|[x - 2F(x)]_+\|^2 - \|x\|^2 + \|[F(x) - 2x]_+\|^2 - \|F(x)\|^2 \right)$$

(here $[\cdot]_+$ denotes the componentwise projection onto the nonnegative orthant). $M_{ms}$ is a merit function with $C = \mathbb{R}^n$, so that solving (NC) is equivalent to finding the unconstrained global solutions of the problem $\min_x M_{ms}(x)$. Furthermore, the merit function $M_{ms}$ enjoys the following properties.
- $M_{ms}$ is continuously differentiable.

- If the Jacobian of $F$ is a positive definite matrix for every $x$, then every stationary point of Problem (PM) is a global minimum point of Problem (PM) [39].

- If $F$ is strongly monotone and globally Lipschitzian, then the sets $L(\alpha)$ are bounded [39].
The regularized gap function of Fukushima is defined for variational inequalities. When specialized to nonlinear complementarity problems it becomes (to simplify, we have fixed a free parameter)

$$M_{fa}(x) := x^T F(x) + \frac{1}{2}\left( \|[x - F(x)]_+\|^2 - \|x\|^2 \right).$$

$M_{fa}$ is a merit function with $C = \mathbb{R}^n_+$, so that solving (NC) is equivalent to finding the global solutions of the simply constrained minimization problem $\min\{M_{fa}(x) : x \in \mathbb{R}^n_+\}$. Furthermore, the merit function $M_{fa}$ enjoys the following properties.

- $M_{fa}$ is continuously differentiable.

- If the Jacobian of $F$ is a positive definite matrix for every $x$, then every stationary point of Problem (PM) is a global minimum point of Problem (PM) [11].

- If $F$ is strongly monotone, then the sets $L(\alpha)$ are bounded [35].

We note that the implicit Lagrangian merit function is simpler than the regularized gap function, since it only requires an unconstrained minimization, but the condition needed to have bounded level sets is stronger for the implicit Lagrangian than for the regularized gap function. The purpose of this paper is twofold: on the one hand we study a new merit function which can be used to reformulate the nonlinear complementarity problem as a smooth, unconstrained minimization problem; on the other hand we propose a globally convergent algorithm for the solution of Problem (NC) and study its theoretical properties. The new merit function is based on the following simple, convex function of two variables:
$$\varphi(a,b) := \sqrt{a^2 + b^2} - (a + b).$$

The most interesting property of this function is that, as is easily verified,

$$\varphi(a,b) = 0 \iff a \ge 0,\quad b \ge 0,\quad ab = 0; \tag{1}$$
note also that $\varphi$ is continuously differentiable everywhere except at the origin. The function $\varphi$ was introduced by Fischer [9] in 1992; since then it has attracted the attention of many researchers and has proved to be a valuable tool [6, 10, 12, 15, 17, 28, 36].
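As a quick sanity check of the equivalence (1), the following minimal sketch (Python with NumPy; the helper name `fischer` is ours, not the paper's) evaluates $\varphi$ on a few pairs:

```python
import numpy as np

def fischer(a, b):
    """Fischer's function: phi(a, b) = sqrt(a^2 + b^2) - (a + b)."""
    return np.sqrt(a**2 + b**2) - (a + b)

# phi vanishes exactly on the complementarity set {a >= 0, b >= 0, ab = 0}
print(fischer(0.0, 3.0))   # 0.0: a = 0, b >= 0
print(fischer(2.0, 0.0))   # 0.0: a >= 0, b = 0
print(fischer(1.0, 1.0))   # sqrt(2) - 2 < 0: both positive, ab != 0
print(fischer(-1.0, 0.0))  # 2 > 0: violates a >= 0
```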
Exploiting (1) it is readily seen that the following system of nonsmooth equations is equivalent to the nonlinear complementarity problem:

$$\Phi(x) := \begin{pmatrix} \varphi(x_1, F_1(x)) \\ \vdots \\ \varphi(x_i, F_i(x)) \\ \vdots \\ \varphi(x_n, F_n(x)) \end{pmatrix} = 0.$$

It is then obvious that the function

$$\Psi(x) := \|\Phi(x)\|^2 = \sum_{i=1}^n \varphi(x_i, F_i(x))^2$$

is a merit function with $C = \mathbb{R}^n$, and that solving (NC) is equivalent to finding the unconstrained global solutions of the problem $\min_x \Psi(x)$. We shall prove that the merit function $\Psi$ enjoys the following properties:
- $\Psi$ is continuously differentiable; furthermore, if every $F_i$ is an SC1 function, then $\Psi$ is also an SC1 function (we recall that this means that $\Psi$ is continuously differentiable and its gradient is semismooth; see the next section for a formal definition).

- If $F$ is a $P_0$-function, then every stationary point of Problem (PM) is a global minimum point of Problem (PM).

- If $F$ is a uniform $P$-function, then the sets $L(\alpha)$ are bounded.

Furthermore, we should also add that $\Phi$ is semismooth (see [21, 29]), and this is a significant analytical property; in particular we note that semismoothness is a stronger and more far-reaching property than B-differentiability, the latter being a property often used in recent years in the study of nonlinear complementarity problems [13, 24, 37, 38]. We remark that the theoretical properties of $\Psi$ appear to be superior to those of the implicit Lagrangian and of the regularized gap function. In fact the function $\Psi$ allows us to solve the nonlinear complementarity problem by an unconstrained minimization, and the conditions under which every stationary point of the merit function is a global minimizer and the level sets are bounded are substantially weaker. Also the differentiability properties of $\Psi$ seem more interesting; actually the semismoothness of $\Phi$ and the SC1 property of $\Psi$ are very important from an algorithmic point of view [5, 21, 25, 26, 27, 29]. It is worth noting that the system $\Phi(x) = 0$ is nonsmooth, but the merit function $\Psi(x) = \|\Phi(x)\|^2$ is, surprisingly enough, smooth. Thus our reformulation of the complementarity problem as a minimization problem seems to inherit the advantages of both nonsmooth and smooth merit functions, while mitigating their drawbacks. In particular we note that, since $\Psi$ is continuously differentiable, it is very easy to force global convergence of algorithms by using the gradient of the merit function, while, on the other hand, we are able to prove, for the first time in the case of smooth merit functions, quadratic convergence even to degenerate solutions.

The merit function $\Psi$ has also been independently introduced by Geiger and Kanzow [12]. Their results are however weaker than those reported above, or simply different. In particular, they showed that every stationary point of the merit function is a global minimum point if $F$ is monotone, while the level sets of $\Psi$ are bounded if $F$ is strongly monotone. Their analysis of the differential properties of $\Psi$ is cruder than ours and, to define superlinearly convergent algorithms for the solution of the complementarity problem, they require the solutions to be nondegenerate, which is not the case for the algorithm described in this paper. On the other hand, Geiger and Kanzow describe an interesting algorithm for the solution of strictly monotone complementarity problems which does not require the evaluation of the Jacobian of $F$.

In this paper we also illustrate a possible use of the merit function $\Psi$ through the description of a technique for globalizing a local algorithm for the solution of complementarity problems. The local algorithm itself is, we think, worthy of attention; it is an active-set algorithm which reduces
the solution of the complementarity problem to the solution of a lower-dimensional system of smooth equations by Newton's method. This local algorithm is quadratically convergent under a mild assumption which is weaker than the classical regularity assumption required by the method of Robinson and Josephy [14, 32]; in particular, it does not require nondegeneracy of the solution, as opposed to the methods of [7, 11, 12, 13, 15, 16, 20, 22, 34]; furthermore, it requires just the solution of a reduced linear system at each iteration. The local algorithm is globalized in a very cheap and simple way by using the merit function $\Psi$. We show that the overall algorithm is globally convergent and, under appropriate, mild assumptions, eventually reduces to the local, fast algorithm, thus retaining its convergence rate. Furthermore, the algorithm is finitely convergent on a wide class of linear complementarity problems. The numerical behavior of the algorithm is illustrated in [6]; the results reported there show that the algorithm is quite promising. We finally point out that, using the proof techniques employed to study the local algorithm, we establish a new sufficient condition for the local uniqueness of solutions to nonlinear complementarity problems.

This paper is organized as follows. In the next section we recall various definitions related to complementarity problems and to differentiability of functions which will be needed in the sequel. In Section 3 we analyze the differential properties of $\Phi$ and $\Psi$, while in Section 4 we prove the main properties of the function $\Psi$. A local algorithm for the solution of Problem (NC) and its globalization through the merit function $\Psi$ are discussed in Section 5. In the last section we make some concluding remarks.

We close this section by giving a list of the notation employed. If $f: \mathbb{R}^n \to \mathbb{R}$ is differentiable, $\nabla f(x)$ is the gradient of $f$ at $x$; it is a column vector. If $F: \mathbb{R}^n \to \mathbb{R}^n$ is differentiable, $\nabla F(x)$ is the $n \times n$ matrix whose $i$-th column is the gradient of $F_i(x)$. If $f: \mathbb{R}^n \to \mathbb{R}$ is a locally Lipschitz function, $\partial f(x)$ is the generalized gradient, i.e., the set of subgradients of $f$ at $x$; this is a set of column vectors. If $F: \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz, then $\partial F(x)$ is the generalized Jacobian of $F$ at $x$. As usual, there is an inconsistency in the notation: if $F$ is single valued, its generalized Jacobian is a set of row vectors and hence does not coincide with the generalized gradient but with its transpose; if $F$ is differentiable, its generalized Jacobian is not $\nabla F(x)$ but $\nabla F(x)^T$. We use the standard notation in nonsmooth analysis for "operations" between sets. For example, if $A$ and $B$ are sets of $n$-vectors,
$$A + B = \{c \in \mathbb{R}^n : c = a + b, \text{ with } a \in A,\ b \in B\}.$$

If $A$ is a set of $n \times n$ matrices and $B$ is a set of $n$-vectors,

$$AB = \{c \in \mathbb{R}^n : c = ab, \text{ with } a \in A,\ b \in B\}.$$
$\|\cdot\|$ denotes the Euclidean norm and $S(\bar x, \delta)$ is the closed Euclidean sphere of center $\bar x$ and radius $\delta$, i.e., $S(\bar x, \delta) = \{x \in \mathbb{R}^n : \|x - \bar x\| \le \delta\}$. If $\Omega$ is a subset of $\mathbb{R}^n$, $\mathrm{dist}\{x \mid \Omega\} := \inf_{y \in \Omega} \|y - x\|$ denotes the (Euclidean) distance of $x$ from $\Omega$.
If $M$ is an $n \times n$ matrix with elements $M_{ij}$, $i, j = 1, \dots, n$, and $I$ and $J$ are index sets such that $I, J \subseteq \{1, \dots, n\}$, we denote by $M_{IJ}$ the $|I| \times |J|$ submatrix of $M$ consisting of the elements $M_{ij}$, $i \in I$, $j \in J$. If $w$ is an $n$-vector, we denote by $w_I$ the subvector with components $w_i$, $i \in I$.
2 Background material

In this section we review some definitions related to nonlinear complementarity problems and to differential properties of functions which will be used in the sequel. A solution of the nonlinear complementarity problem (NC) is a vector $\bar x \in \mathbb{R}^n$ such that

$$F(\bar x) \ge 0, \qquad \bar x \ge 0, \qquad F(\bar x)^T \bar x = 0.$$
Associated with the solution $\bar x$ we define three index sets:

$$\alpha := \{i \mid \bar x_i > 0\}, \qquad \beta := \{i \mid \bar x_i = 0 = F_i(\bar x)\}, \qquad \gamma := \{i \mid F_i(\bar x) > 0\}.$$

The solution $\bar x$ is said to be nondegenerate if $\beta = \emptyset$. In the following definition we introduce two notions of regularity which play a central role in our analysis and which have also been widely used in the analysis of nonlinear complementarity problems.
Definition 2.1 We say that the solution $\bar x$ is

- b-regular if, for every index set $\delta$ such that $\alpha \subseteq \delta \subseteq \alpha \cup \beta$, the principal submatrix $\nabla F(\bar x)_{\delta\delta}$ is nonsingular;

- R-regular if $\nabla F(\bar x)_{\alpha\alpha}$ is nonsingular and the Schur complement of $\nabla F(\bar x)_{\alpha\alpha}$ in

$$\begin{pmatrix} \nabla F(\bar x)_{\alpha\alpha} & \nabla F(\bar x)_{\alpha\beta} \\ \nabla F(\bar x)_{\beta\alpha} & \nabla F(\bar x)_{\beta\beta} \end{pmatrix}$$

is a P-matrix (see below).

We recall that the above-mentioned Schur complement is defined by

$$\nabla F(\bar x)_{\beta\beta} - \nabla F(\bar x)_{\beta\alpha} \nabla F(\bar x)_{\alpha\alpha}^{-1} \nabla F(\bar x)_{\alpha\beta}.$$

Note that R-regularity coincides with the notion of regularity introduced by Robinson in [32] (see also [31], where the same condition is called strong regularity) and is strictly related to similar conditions used, e.g., in [8, 22, 24]. If $\bar x$ is a nondegenerate solution, then the b-regularity condition can be equivalently stated as: the vectors $\nabla F_i(\bar x)$, $i \in \alpha$, and $e_i$, $i \in \gamma$, are linearly independent ($e_i$ denotes the $i$-th column of the identity matrix); b-regularity has been employed, e.g., in [15, 20, 22]. It is known that R-regularity implies b-regularity [24] and local uniqueness of the solution $\bar x$ [31]; furthermore, b-regularity also implies the local uniqueness of the solution $\bar x$, see Proposition 5.4. We shall also employ several definitions concerning matrices and functions, and we shall need some related properties.
Definition 2.2 A matrix $M \in \mathbb{R}^{n \times n}$ is a

- $P_0$-matrix if every one of its principal minors is nonnegative;

- $P$-matrix if every one of its principal minors is positive;

- $R_0$-matrix if the linear complementarity problem

$$Mx \ge 0, \qquad x \ge 0, \qquad x^T Mx = 0$$

has 0 as its unique solution.
It is obvious that every $P$-matrix is also a $P_0$-matrix, and it is known [3] that every $P$-matrix is an $R_0$-matrix. We shall also need the following characterization of $P_0$-matrices [3].

Proposition 2.1 A matrix $M \in \mathbb{R}^{n \times n}$ is a $P_0$-matrix if and only if, for every nonzero vector $x$, there exists an index $i$ such that $x_i \ne 0$ and $x_i (Mx)_i \ge 0$.

Definition 2.3 A function $F: \mathbb{R}^n \to \mathbb{R}^n$ is a

- $P_0$-function if, for every $x$ and $y$ in $\mathbb{R}^n$ with $x \ne y$, there is an index $i$ such that

$$x_i \ne y_i, \qquad (x_i - y_i)[F_i(x) - F_i(y)] \ge 0;$$

- $P$-function if, for every $x$ and $y$ in $\mathbb{R}^n$ with $x \ne y$, there is an index $i$ such that

$$(x_i - y_i)[F_i(x) - F_i(y)] > 0;$$

- uniform $P$-function if there exists a positive constant $\mu$ such that, for every $x$ and $y$ in $\mathbb{R}^n$, there is an index $i$ such that

$$(x_i - y_i)[F_i(x) - F_i(y)] \ge \mu \|y - x\|^2;$$

- monotone function if, for every $x$ and $y$ in $\mathbb{R}^n$,

$$(x - y)^T [F(x) - F(y)] \ge 0;$$

- strictly monotone function if, for every $x$ and $y$ in $\mathbb{R}^n$ with $x \ne y$,

$$(x - y)^T [F(x) - F(y)] > 0;$$

- strongly monotone function if there is a positive constant $\mu$ such that, for every $x$ and $y$ in $\mathbb{R}^n$,

$$(x - y)^T [F(x) - F(y)] \ge \mu \|y - x\|^2.$$
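To make Definitions 2.1 and 2.2 concrete, here is a small illustrative sketch (Python with NumPy; the helper names and the test matrix are our own) that checks the $P$/$P_0$ property by enumerating principal minors and forms the Schur complement appearing in the definition of R-regularity:

```python
import itertools
import numpy as np

def is_P_matrix(M, strict=True, tol=1e-12):
    """Check the P (or, with strict=False, the P_0) property by enumerating
    all principal minors; exponential in n, for small examples only."""
    n = M.shape[0]
    for k in range(1, n + 1):
        for idx in itertools.combinations(range(n), k):
            minor = np.linalg.det(M[np.ix_(idx, idx)])
            bad = minor <= tol if strict else minor < -tol
            if bad:
                return False
    return True

def schur_complement(M, alpha, beta):
    """Schur complement of M[alpha, alpha], i.e. M_bb - M_ba M_aa^{-1} M_ab,
    as used in Definition 2.1 (R-regularity)."""
    Maa = M[np.ix_(alpha, alpha)]
    Mab = M[np.ix_(alpha, beta)]
    Mba = M[np.ix_(beta, alpha)]
    Mbb = M[np.ix_(beta, beta)]
    return Mbb - Mba @ np.linalg.solve(Maa, Mab)

M = np.array([[2.0, -1.0, 0.0],
              [0.0,  2.0, -1.0],
              [0.0,  0.0,  2.0]])
print(is_P_matrix(M))                    # True: all principal minors positive
print(schur_complement(M, [0], [1, 2]))  # the 2x2 Schur complement block
```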
It is obvious that every monotone function is a $P_0$-function, every strictly monotone function is a $P$-function, and every strongly monotone function is a uniform $P$-function. Furthermore, it is known that the Jacobian of every continuously differentiable $P_0$-function is a $P_0$-matrix, and that if the Jacobian of a continuously differentiable function is a $P$-matrix for every $x$, then the function is a $P$-function. If $F$ is affine, that is, if $F(x) = Mx + q$, then $F$ is a $P_0$-function if and only if $M$ is a $P_0$-matrix, while $F$ is a (uniform) $P$-function if and only if $M$ is a $P$-matrix (note that in the affine case the concepts of uniform $P$-function and $P$-function coincide).

In the remaining part of this section we recall some basic definitions about semismoothness and SC1 functions. Semismooth functions were introduced in [21] and were immediately shown to be relevant to optimization algorithms. Recently the concept of semismoothness has been extended to vector-valued functions [29].

Definition 2.4 Let $F: \mathbb{R}^n \to \mathbb{R}^m$ be locally Lipschitz at $x \in \mathbb{R}^n$. We say that $F$ is semismooth at $x$ if

$$\lim_{\substack{H \in \partial F(x + t v'),\ v' \to v,\ t \downarrow 0}} H v' \tag{2}$$

exists for any $v \in \mathbb{R}^n$.

Semismooth functions lie between Lipschitz functions and $C^1$ functions. Note that this class is strictly contained in the class of B-differentiable functions. It is known that [21, 29]:

(a) Continuously differentiable functions and convex functions are semismooth; also, composites of semismooth functions are semismooth.

(b) If a function $F$ is semismooth at $x$, then $F$ is directionally differentiable at $x$, and the directional derivative $F'(x; d)$ is equal to the limit (2).

We can now give the definition of SC1 function.

Definition 2.5 A function $f: \mathbb{R}^n \to \mathbb{R}$ is said to be an SC1 function if $f$ is continuously differentiable and its gradient is semismooth.

SC1 functions can be viewed as functions which lie between $C^1$ and $C^2$ functions. Semismooth systems of equations form an important class, since they often occur in practice and many of the classical methods for their solution (e.g., Newton's method) can be extended to solve such problems [25, 27, 29]. Analogously, many classical results concerning the minimization of $C^2$ functions can be extended to the minimization of SC1 functions (see, e.g., [5, 26] and references therein), which, in turn, play an important role in many optimization problems. Under very mild differentiability assumptions on $F$, the new merit function we will introduce in the next section is an SC1 function.
3 Differential results

In this section we shall study the differential properties of $\Phi$ and $\Psi$. In particular, we shall give an estimate of the generalized Jacobian of $\Phi$ and a sufficient condition for the nonsingularity of all the generalized Jacobians at a solution of (NC). Then we shall establish that $\Phi$ is semismooth, that $\Psi$ is continuously differentiable and that, if $F$ is an SC1 function, $\Psi$ is also SC1. We recall that, unless otherwise stated, we assume that $F$ is everywhere continuously differentiable.
Proposition 3.1

$$\partial \Phi(x)^T \subseteq (A(x) - I) + \nabla F(x)(B(x) - I), \tag{3}$$

where $I$ is the $n \times n$ identity matrix and $A(x)$ and $B(x)$ are possibly multivalued $n \times n$ diagonal matrices whose $i$-th diagonal elements are given by

$$A_{ii}(x) = \frac{x_i}{\|(x_i, F_i(x))\|}, \qquad B_{ii}(x) = \frac{F_i(x)}{\|(x_i, F_i(x))\|}$$

if $(x_i, F_i(x)) \ne 0$, and by

$$A_{ii}(x) = \xi_i, \qquad B_{ii}(x) = \rho_i, \qquad \text{for every } (\xi_i, \rho_i) \text{ such that } \|(\xi_i, \rho_i)\| \le 1,$$

if $(x_i, F_i(x)) = 0$.

Proof. By known rules on the evaluation of the generalized Jacobian (see [2], Proposition 2.6.2 (e)),

$$\partial \Phi(x)^T \subseteq (\partial \Phi_1(x), \dots, \partial \Phi_n(x)).$$

If $i$ is such that $(x_i, F_i(x)) \ne 0$, then it is easy to check that $\Phi_i(x)$ is differentiable and

$$\nabla \Phi_i(x) = \nabla \varphi(x_i, F_i(x)) = \left( \frac{x_i}{\|(x_i, F_i(x))\|} - 1 \right) e_i + \nabla F_i(x) \left( \frac{F_i(x)}{\|(x_i, F_i(x))\|} - 1 \right).$$

If $i$ is such that $(x_i, F_i(x)) = 0$, by using the theorem on the generalized gradient of a composite function (see [2], Theorem 2.3.9 (iii)) and recalling that

$$\partial \|(0, 0)\| = \{(\xi_i, \rho_i) : \|(\xi_i, \rho_i)\| \le 1\},$$

we get

$$\partial \Phi_i(x) = \partial \varphi(x_i, F_i(x)) = (\xi_i - 1) e_i + \nabla F_i(x)(\rho_i - 1).$$

From these equalities the proposition easily follows. □
Exploiting the estimate (3), it is now possible to give a sufficient condition for the nonsingularity of all the generalized Jacobians of $\Phi$ at a solution of the nonlinear complementarity problem. This result is important from the algorithmic point of view; see Section 5 and [29].

Proposition 3.2 Suppose that $\bar x$ is an R-regular solution of Problem (NC). Then every matrix in $\partial \Phi(\bar x)$ is nonsingular.

Proof. Using the expression (3) and taking into account that $\bar x$ is a solution of the nonlinear complementarity problem, any matrix $C$ belonging to $\partial \Phi(\bar x)^T$ can be written in the following partitioned form:

$$C = \begin{pmatrix} -\nabla F_{\alpha\alpha} & \nabla F_{\alpha\beta}(B_\beta - I) & 0 \\ -\nabla F_{\beta\alpha} & \nabla F_{\beta\beta}(B_\beta - I) + (A_\beta - I) & 0 \\ -\nabla F_{\gamma\alpha} & \nabla F_{\gamma\beta}(B_\beta - I) & -I \end{pmatrix}, \tag{4}$$

where, to simplify the notation, we have omitted the argument $\bar x$, and where $A_\beta$ and $B_\beta$ denote the diagonal blocks of $A(\bar x)$ and $B(\bar x)$ corresponding to the indices in $\beta$. It is easy to see that such a $C$ is nonsingular if and only if its "left upper corner"

$$G = \begin{pmatrix} -\nabla F_{\alpha\alpha} & \nabla F_{\alpha\beta}(B_\beta - I) \\ -\nabla F_{\beta\alpha} & \nabla F_{\beta\beta}(B_\beta - I) + (A_\beta - I) \end{pmatrix} \tag{5}$$

is nonsingular. Showing that the matrix $G$ is nonsingular is equivalent to showing that the only solution of the system

$$-G y = -G \begin{pmatrix} y_\alpha \\ y_\beta \end{pmatrix} = 0$$

is the zero vector (we have changed sign for simplicity). This system can be rewritten as

$$\begin{cases} \nabla F_{\alpha\alpha}\, y_\alpha + \nabla F_{\alpha\beta}(I - B_\beta)\, y_\beta = 0, \\ \nabla F_{\beta\alpha}\, y_\alpha + \nabla F_{\beta\beta}(I - B_\beta)\, y_\beta = -(I - A_\beta)\, y_\beta, \end{cases}$$

from which, recalling that $\nabla F_{\alpha\alpha}$ is nonsingular by the R-regularity assumption, we obtain, solving the first equation for $y_\alpha$ and substituting into the second equation,

$$\begin{cases} y_\alpha = -\nabla F_{\alpha\alpha}^{-1} \nabla F_{\alpha\beta}(I - B_\beta)\, y_\beta, \\ \left( \nabla F_{\beta\beta} - \nabla F_{\beta\alpha} \nabla F_{\alpha\alpha}^{-1} \nabla F_{\alpha\beta} \right)(I - B_\beta)\, y_\beta = -(I - A_\beta)\, y_\beta, \end{cases} \tag{6}$$

where $\nabla F_{\beta\beta} - \nabla F_{\beta\alpha} \nabla F_{\alpha\alpha}^{-1} \nabla F_{\alpha\beta} = (G / \nabla F_{\alpha\alpha})$ is, by definition, the Schur complement of $\nabla F_{\alpha\alpha}$ in $G$ and is hence a P-matrix by the R-regularity assumption. Then showing the nonsingularity of $G$ is equivalent to showing that the only vector $y_\beta$ which solves the second equation of (6), i.e.,

$$(G / \nabla F_{\alpha\alpha})(I - B_\beta)\, y_\beta = -(I - A_\beta)\, y_\beta, \tag{7}$$

is $y_\beta = 0$. We proceed by contradiction: assume that there exists a solution $y_\beta \ne 0$, and consider two cases.

1) $(I - B_\beta) y_\beta = 0$. Define $I' = \{i : (y_\beta)_i \ne 0\}$. Note that $I' \ne \emptyset$ because we are assuming $y_\beta \ne 0$. This means that $(B_\beta)_{ii} = 1$ for every $i \in I'$, which, in turn, implies $(A_\beta)_{ii} = 0$ for every $i \in I'$ by the definition of the matrices $A$ and $B$. Hence $-(I - A_\beta) y_\beta \ne 0$, which is absurd.

2) $(I - B_\beta) y_\beta \ne 0$. The components of $(I - B_\beta) y_\beta$ and $-(I - A_\beta) y_\beta$ which are both nonzero (if any) have opposite signs. This implies, by (7),

$$[(I - B_\beta) y_\beta]_i \, [(G / \nabla F_{\alpha\alpha})(I - B_\beta) y_\beta]_i \le 0, \qquad \forall i \in \beta.$$

Since $(G / \nabla F_{\alpha\alpha})$ is a P-matrix, this is possible only if $(I - B_\beta) y_\beta = 0$; again we have a contradiction, and the proof is complete. □

Another important property of $\Phi$ is that it is a semismooth function. This property, too, is very important from the computational point of view.
Proposition 3.3 The function $\Phi$ is semismooth.
Proof. $\Phi$ is semismooth if and only if each of its components is semismooth [29]. But $\Phi_i(x)$ is the composite of the convex function $\varphi: \mathbb{R}^2 \to \mathbb{R}$ and of the differentiable function $(x_i, F_i(x))^T: \mathbb{R}^n \to \mathbb{R}^2$. Since convex functions and differentiable functions are semismooth, and the composite of semismooth functions is semismooth, the proposition is proved. □
We now pass to consider the differential properties of the function $\Psi$. The first result is somewhat surprising: it states that $\Psi$ is continuously differentiable.
Proposition 3.4 $\Psi(x)$ is continuously differentiable and its gradient is $\nabla \Psi(x) = 2\, \partial \Phi(x)^T \Phi(x)$.

Proof. By known rules on the calculus of generalized gradients (see [2], Theorem 2.6.6) it holds that $\partial \Psi(x) \subseteq 2\, \partial \Phi(x)^T \Phi(x)$. It is easy to check that $2\, \partial \Phi(x)^T \Phi(x)$ is single valued everywhere, since the zero components of $\Phi(x)$ cancel the "multivalued columns" of $\partial \Phi(x)^T$; hence, by the Corollary to Theorem 2.2.4 in [2], $\Psi(x)$ is continuously differentiable. □
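To illustrate Propositions 3.1 and 3.4, the following sketch (Python with NumPy; the toy mapping $F$ and the helper names are our own, not the paper's) assembles $\nabla \Psi(x) = 2\, \partial \Phi(x)^T \Phi(x)$ from the diagonal matrices $A(x)$ and $B(x)$ of Proposition 3.1 and compares it with a finite-difference approximation; the multivalued entries $(\xi_i, \rho_i)$ are multiplied by the vanishing components $\Phi_i(x)$ and therefore do not affect the result:

```python
import numpy as np

def F(x):  # a toy, continuously differentiable mapping (our choice)
    return np.array([2*x[0] + x[1]**2 - 1.0, x[0] + 3*x[1] + 2.0])

def jacF(x):  # Jacobian J[i, j] = dF_i/dx_j, i.e. J = nablaF(x)^T in the paper's notation
    return np.array([[2.0, 2*x[1]], [1.0, 3.0]])

def Phi(x):
    f = F(x)
    return np.sqrt(x**2 + f**2) - (x + f)

def gradPsi(x):
    f, J = F(x), jacF(x)
    r = np.sqrt(x**2 + f**2)
    phi = r - (x + f)
    # where phi_i = 0 the quotients are multivalued, but they multiply phi_i = 0;
    # we guard the division to avoid 0/0 warnings
    safe = np.where(r > 0, r, 1.0)
    return 2 * ((x / safe - 1) * phi + J.T @ ((f / safe - 1) * phi))

x = np.array([0.5, -0.3])
num = np.zeros(2)
for i in range(2):  # central finite differences on Psi = ||Phi||^2
    e = np.zeros(2); e[i] = 1e-6
    num[i] = (Phi(x + e) @ Phi(x + e) - Phi(x - e) @ Phi(x - e)) / 2e-6
print(np.allclose(gradPsi(x), num, atol=1e-5))  # True
```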
The second result about the differentiability properties of $\Psi$ is that if $F$ is SC1, then $\Psi$ is also SC1. This result will not be explicitly used in this paper, but we think that it is of great significance and that it also explains the good numerical behavior of algorithms based on the merit function $\Psi$.
Proposition 3.5 If every $F_i$ is an SC1 function, then $\Psi(x)$ is an SC1 function.

Proof. The function $\Psi = \sum_{i=1}^n \varphi(x_i, F_i)^2$ is SC1 if every

$$\varphi(x_i, F_i)^2 = 2\left( x_i^2 + F_i^2 + x_i F_i \right) - 2(x_i + F_i)\,\|(x_i, F_i)\|$$

is SC1. It is obvious that it is sufficient to show that the term $(x_i + F_i)\|(x_i, F_i)\|$ is SC1. It is easy to check that $(x_i + F_i)\|(x_i, F_i)\|$ is continuously differentiable and that its gradient is

$$\begin{cases} (e_i + \nabla F_i)\,\|(x_i, F_i)\| + (x_i + F_i)\left[ x_i\,\|(x_i, F_i)\|^{-1}\, e_i + F_i\,\|(x_i, F_i)\|^{-1}\, \nabla F_i \right] & \text{if } (x_i, F_i) \ne (0,0), \\ 0 & \text{if } (x_i, F_i) = (0,0). \end{cases}$$
Again, to check that this gradient is semismooth we only need to check that every component is semismooth, and this, in turn, reduces to checking that the "troublesome" terms

$$\begin{cases} \dfrac{(x_i + F_i)\, F_i}{\|(x_i, F_i)\|} & \text{if } (x_i, F_i) \ne (0,0), \\ 0 & \text{if } (x_i, F_i) = (0,0), \end{cases}$$

and

$$\begin{cases} \dfrac{(x_i + F_i)\, x_i}{\|(x_i, F_i)\|} & \text{if } (x_i, F_i) \ne (0,0), \\ 0 & \text{if } (x_i, F_i) = (0,0), \end{cases}$$

are semismooth (note that the term $\|(x_i, F_i)\|$ is semismooth, since it is the composite of the convex, and hence semismooth, norm function with semismooth functions). We will check this only for the first term; the proof for the second one is analogous.
Since the composite of semismooth functions is semismooth, we only have to show that

$$\bar\psi(a,b) = \begin{cases} \dfrac{(a+b)\, b}{\|(a,b)\|} & \text{if } (a,b) \ne (0,0), \\ 0 & \text{if } (a,b) = (0,0), \end{cases}$$

is semismooth. First we show that it is locally Lipschitzian. This is obvious everywhere except at the origin, so let us consider this point. We first note that

$$\left| \frac{(a+b)\, b}{\|(a,b)\|} - 0 \right| = \frac{|a+b|\,|b|}{\|(a,b)\|} \le \frac{\sqrt{2}\,\|(a,b)\|\,\|(a,b)\|}{\|(a,b)\|} = \sqrt{2}\,\|(a,b) - (0,0)\|. \tag{8}$$

Furthermore, at points different from the origin, it is readily seen that $\nabla \bar\psi(c,d)$ is given by

$$(\nabla \bar\psi(c,d))_1 = \frac{d\,\|(c,d)\| - (cd + d^2)\,\dfrac{c}{\|(c,d)\|}}{\|(c,d)\|^2}, \qquad (\nabla \bar\psi(c,d))_2 = \frac{(c+2d)\,\|(c,d)\| - (cd + d^2)\,\dfrac{d}{\|(c,d)\|}}{\|(c,d)\|^2}, \tag{9}$$

and it is easy to verify that the norm of $\nabla \bar\psi(c,d)$ is bounded on any bounded set which does not contain the origin. Consider now a convex, open, bounded neighborhood $\Omega$ of the origin. We want to show that there exists a positive constant $L$ such that, for every pair of points $y$ and $z$ belonging to $\Omega$, we have $|\bar\psi(z) - \bar\psi(y)| \le L \|z - y\|$. To this end we consider two cases.

a) The origin does not belong to the closed segment $[y, z]$. In this case we can apply the mean value theorem and obtain

$$|\bar\psi(z) - \bar\psi(y)| = |\nabla \bar\psi(w)^T (z - y)| \le M \|z - y\|,$$

where $w$ is a point belonging to the open segment $(y, z)$ and $M$ is any positive constant majorizing the norm of the gradient of $\bar\psi$ on the bounded set $\Omega \setminus \{0\}$.

b) The origin belongs to the closed segment $[y, z]$. In this case we have $\|z - y\| = \|z\| + \|y\|$, so that, exploiting (8), we can write

$$|\bar\psi(z) - \bar\psi(y)| \le |\bar\psi(z) - 0| + |\bar\psi(y) - 0| \le \sqrt{2}\,(\|z\| + \|y\|) = \sqrt{2}\,\|z - y\|.$$

Hence the local Lipschitzianity of $\bar\psi$ at the origin is proved with $L = \max\{\sqrt{2}, M\}$.

To check semismoothness we also only have to check semismoothness at $(0,0)$, since at other points $\bar\psi(a,b)$ is continuously differentiable and hence semismooth. To check semismoothness at $(0,0)$ we employ Theorem 2.3 (iv) in [29], which states that the locally Lipschitzian function $\bar\psi$ is semismooth at $(0,0)$ if and only if, for every $\xi \in \partial \bar\psi((0,0) + (c,d))$ with $(c,d) \to 0$, it holds that

$$\xi^T \begin{pmatrix} c \\ d \end{pmatrix} - \bar\psi'((0,0); (c,d)) = o(\|(c,d)\|). \tag{10}$$

To this end we first note that, using the very definition of directional derivative, it is easy to check that

$$\bar\psi'((0,0); (c,d)) = \bar\psi(c,d). \tag{11}$$

Furthermore, taking into account that, for every $(c,d) \ne (0,0)$, $\bar\psi((0,0) + (c,d))$ is differentiable, the vector $\xi$ in the theorem of Qi and Sun reduces to $\nabla \bar\psi(c,d)$. Employing (11) and (9), it is now easy to check that the left-hand side of (10) is identically 0, so that $\bar\psi$ is semismooth and the proof is complete. □
4 Properties of $\Psi$

In this section we prove two important results on the function $\Psi$. The first result states that if $F$ is a $P_0$-function, then every point such that $\nabla \Psi(x) = 0$ is a global minimum point of $\Psi$; the second result establishes that if $F$ is a uniform $P$-function, then $\Psi$ has bounded level sets. These results are stronger than analogous results obtained for other merit functions in the literature, as we already discussed in the Introduction. This follows by recalling the properties of the implicit Lagrangian and of the regularized gap function stated in the Introduction and by noting that if the Jacobian of $F$ is everywhere positive definite, then $F$ is monotone, that every monotone function is a $P_0$-function, and that every strongly monotone function is a uniform $P$-function.
Theorem 4.1 Suppose that $F$ is a $P_0$-function. Then every stationary point $x$ of $\Psi$ is such that $\Psi(x) = 0$.

Proof. Suppose that $\nabla \Psi(x) = 0$; this means

$$[(A(x) - I) + \nabla F(x)(B(x) - I)]\,\Phi(x) = 0; \tag{12}$$

we want to show that $\Phi(x) = 0$. Suppose the contrary. Consider the vector $(B(x) - I)\Phi(x)$; by its structure it is easy to see that its $i$-th component is different from 0 if and only if $\Phi_i(x) \ne 0$. In fact, if $\Phi_i(x) \ne 0$, $(B_{ii}(x) - 1)\Phi_i(x)$ can be 0 only if $B_{ii}(x) = 1$. But $\Phi_i(x) \ne 0$ means that one of the following situations occurs:

1. $x_i \ne 0$ and $F_i(x) \ne 0$;
2. $x_i = 0$ and $F_i(x) < 0$;
3. $x_i < 0$ and $F_i(x) = 0$.

In every case it is obvious, by the definition of $B$, that $B_{ii}(x) \ne 1$, so that $(B_{ii}(x) - 1)\Phi_i(x) \ne 0$. Similar reasoning can be repeated for the vector $(A(x) - I)\Phi(x)$. It is then easy to verify that if $\Phi(x) \ne 0$, then $(B(x) - I)\Phi(x)$ and $(A(x) - I)\Phi(x)$ are both different from 0, have their nonzero elements in the same positions, and such nonzero elements have the same signs. But then, for (12) to hold, it would be necessary that $\nabla F(x)$ "revert the sign" of all the nonzero elements of $(B(x) - I)\Phi(x)$, which, by Proposition 2.1, contradicts the fact that $\nabla F(x)$ is a $P_0$-matrix (because $F$ is a $P_0$-function). □

The proof of the next theorem uses a technique which was introduced by Geiger and Kanzow [12] in order to prove the same theorem in the case of uniformly monotone functions.
Theorem 4.2 Suppose that $F$ is a uniform $P$-function. Then the level sets of $\Psi$ are bounded.

Proof. We shall show that

$$\lim_{\|x^k\| \to \infty} \Psi(x^k) = \infty. \tag{13}$$

So, let $\{x^k\}$ be a sequence such that $\{\|x^k\|\} \to \infty$. Define the index set $J = \{i : \{x_i^k\} \text{ is unbounded}\}$. Since $\{x^k\}$ is unbounded, $J \ne \emptyset$. Let $\{z^k\}$ denote a bounded sequence defined in the following way:

$$z_i^k = \begin{cases} 0 & \text{if } i \in J, \\ x_i^k & \text{if } i \notin J. \end{cases}$$

From the definition of $\{z^k\}$ and the assumption on $F$ we get

$$\mu \sum_{i \in J} (x_i^k)^2 = \mu \|x^k - z^k\|^2 \le \max_{i \in \{1,\dots,n\}} \left( x_i^k - z_i^k \right)\left( F_i(x^k) - F_i(z^k) \right) = \max_{i \in J} x_i^k \left( F_i(x^k) - F_i(z^k) \right) = x_j^k \left( F_j(x^k) - F_j(z^k) \right) \le |x_j^k|\, \left| F_j(x^k) - F_j(z^k) \right|, \tag{14}$$

where $\mu$ is the positive constant from the definition of uniform $P$-function and $j$ is one of the indices at which the max is attained, which we have, without loss of generality (subsequencing if necessary), assumed to be independent of $k$. Since $j \in J$, we have that

$$\{|x_j^k|\} \to \infty, \tag{15}$$

so that, dividing (14) by $|x_j^k|$ and noting that $\sum_{i \in J} (x_i^k)^2 \ge (x_j^k)^2$, we obtain

$$\mu |x_j^k| \le \left| F_j(x^k) - F_j(z^k) \right|,$$

which in turn, since $F_j(z^k)$ is bounded, implies

$$\{|F_j(x^k)|\} \to \infty. \tag{16}$$

But (15) and (16) imply $\{|\varphi(x_j^k, F_j(x^k))|\} \to \infty$, from which (13) readily follows. □

In the linear case the following stronger result can easily be derived from Theorem 2.1 (c) in [36].
Theorem 4.3 Suppose that $F$ is affine, i.e., that $F(x) = Mx + q$. Then $\lim_{\|x\| \to \infty} \Psi(x) = \infty$ if and only if $M$ is an $R_0$-matrix.
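As a small numerical illustration of the results of this section (not taken from the paper: the solver choice and the data are our own), one can hand $\Psi$ and its gradient to any off-the-shelf unconstrained minimizer; for an affine $F(x) = Mx + q$ with $M$ a $P$-matrix, Theorems 4.1 and 4.2 guarantee that every stationary point found is a solution of (NC):

```python
import numpy as np
from scipy.optimize import minimize

# affine complementarity problem F(x) = Mx + q with M a P-matrix (our data)
M = np.array([[2.0, 1.0], [0.0, 2.0]])
q = np.array([-4.0, 1.0])

def Psi(x):
    f = M @ x + q
    phi = np.sqrt(x**2 + f**2) - (x + f)
    return phi @ phi

def gradPsi(x):
    f = M @ x + q
    r = np.sqrt(x**2 + f**2)
    phi = r - (x + f)
    safe = np.where(r > 0, r, 1.0)  # guard the multivalued 0/0 case
    return 2 * ((x / safe - 1) * phi + M.T @ ((f / safe - 1) * phi))

res = minimize(Psi, x0=np.array([5.0, 5.0]), jac=gradPsi, method='BFGS')
print(res.x, M @ res.x + q)  # Psi(res.x) ~ 0 at the solution
```

For this data the unique solution is $x = (2, 0)$ with $F(x) = (0, 1)$, and the smooth merit function $\Psi$ lets a standard quasi-Newton method find it without any constraint handling.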
5 The algorithm

The merit function $\Psi$ can be used in several ways to define globally convergent algorithms for the solution of nonlinear complementarity problems: for example, one could simply use an off-the-shelf algorithm to minimize $\Psi$. In this section we use the merit function in a different, but classical, way. We first define a fast, local algorithm for the solution of Problem (NC). Then we globalize this local algorithm by performing an Armijo-type linesearch along the "local" direction, reverting to the antigradient of $\Psi$ when the "local" direction is not a good descent direction for the merit function. Note that this scheme follows exactly the same lines used in the classical stabilization scheme for Newton's method for the unconstrained minimization of a twice continuously differentiable function. The crucial point will be to show that eventually the gradient direction is never used and the stepsize of one is accepted, so that, locally, the global algorithm coincides with the local one, thus ensuring a fast asymptotic convergence rate. We remark that this is neither the only way to exploit the function $\Psi$, nor, possibly, the best one. However, we note that the local algorithm enjoys several interesting properties and that the overall global algorithm, in spite of its simplicity, performs surprisingly well; see [6].
5.1 The local algorithm

In this section we describe a local algorithm for the solution of nonlinear complementarity problems. The algorithm generates a sequence of points $\{x^k\}$ defined by

$$x^{k+1} = x^k + d^k.$$

To motivate the local algorithm we first consider a simplified situation. Suppose that $\bar x$ is a solution of Problem (NC), that $\bar x$ is nondegenerate, and that we know the sets $A$ and $N$ of variables which are 0 or positive at $\bar x$:

$$A := \{i \mid \bar x_i = 0\}, \qquad N := \{i \mid \bar x_i > 0\}.$$

Then, to determine $\bar x_N$ we would only need to solve the system of equations

$$F_i(x_N, 0_A) = 0, \qquad i \in N.$$

Provided that $\nabla F_{NN}(\bar x_N, 0_A)$ is nonsingular, we could apply Newton's method to this system, thus setting

$$x_N^{k+1} = x_N^k + d_N^k,$$

where $d_N^k$ is the solution of the following linear system:

$$\left( \nabla F_{NN}(x_N^k, 0_A) \right)^T d_N^k = -F_N(x_N^k, 0_A). \tag{17}$$
Ak := fijxki "Fi (xk )g;
N k := fijxki > "Fi (xk )g;
where " is a xed positive constant. Exploiting continuity it is very easy to check that the following result holds.
Proposition 5.1 Suppose that x is a solution of Problem (NC). Then, for every xed " there exists a neighborhood of x such that, for every xk belonging to
Ak [ N k [ : Furthermore, if x is nondegenerate, then = Ak and = N k .
16
Based on this results it then seems reasonable to de ne dk in the following way.
dkA = ?xkA ; k
(18)
k
while dkN is the solution of the following linear system k
(rFN N (xk )) dkN = ?FN (xk ) + (rFA N (xk )) xkA : k
k
T
k
k
k
k
T
(19)
k
The de nition of dkA is very natural, since if we estimate that Ak is the set of variables which are zero at x, by (18) we obtain xkA+1 = 0: (20) Regarding (19) we note that, if x is nondegenerate, Ak = by Proposition 5.1, so that, since xkA+1 = 0 by (20), (19) reduces to (17). Roughly speaking the extra term (rFA N (xk )) xkA in (19) is needed to deal with degeneracy, this will be clearer from the proof of the following theorem, where we have collected the main properties of this local algorithm. k
k
k
k
k
T
k
Theorem 5.2 Suppose that x is a b-regular solution of Problem (NC). Then there exists a neighborhood of x such that, if xo belongs to , the algorithm de ned above is such that
a. All the linear systems which have to be solved are uniquely solvable. b. fxk g ! x. c. The converge rate of the sequence fxk g to x is at least superlinear; if the Jacobian of F is locally Lipschitzian at x, then the convergence rate is quadratic.
Proof. Let A be an index set such that
A [
(21)
and denote by N its complement, i.e. N = f1; : : :; ng n A. Consider the function
"
#
HA(x) = FN (x) : xA By (21) we have that HA(x) = 0; furthermore we can write
rHA(x) = rFNN (x) O rFAN (x) IAA
!
(22)
which clearly shows, taking into account the b-regularity assumption and (21), that rHA(x) is nonsingular. Hence we can apply Newton's method to the solution of system HA(x) = 0 and, thanks to the nonsingularity of the Jacobian at the solution x, all standard results hold: in particular there exists a neighborhood A of x such that if x0 belongs to A, the sequence fxk g determined by the Newton methods is well de ned, converges to x, and the convergence rate is at least superlinear, quadratic if the Jacobian of F is Lipschitz continuous. 17
We now note that it is readily seen, using (22), that the vector dk de ned by (18)-(19) can also be equivalently obtained as the Newton's direction for the solution of the system HAk (x) = 0, that is h i?1 dk = ? rHAk (xk ) HAk (xk ): By Proposition 5.1 we have Ak [ , so that the algorithm de ned by (18)-(19) can be seen as a sequence of Newton's steps for a nite number of functions which all have the same solution x and whose Jacobians are all nonsingular at x The theorem then follows by taking T
=
\
A: A [
A:
2
An interesting feature of the local algorithm is that, under the b-regularity assumption, it is finitely convergent in the case of linear complementarity problems.

Theorem 5.3 Suppose that $\bar x$ is a b-regular solution of a linear complementarity problem. Then there exists a neighborhood $\Omega$ of $\bar x$ such that, if $x^0$ belongs to $\Omega$, the algorithm above finds the solution $\bar x$ in a single step.

Proof. Take $\Omega$ to be any neighborhood of $\bar x$ for which

$$\gamma \subseteq A^0 \subseteq \gamma \cup \beta.$$

Reasoning as in the proof of the previous theorem, and using the same notation introduced there, we see that $d^0$ can be seen as the Newton direction for the solution of the nonsingular linear system

$$H_{A^0}(x) = \begin{bmatrix} F_{N^0}(x) \\ x_{A^0} \end{bmatrix} = 0,$$

which has the unique solution $\bar x$. The assertion then easily follows from the fact that Newton's method solves nonsingular linear systems in one iteration. □

The properties reported in the two previous theorems are the natural extensions of the classical results for Newton's method for systems of smooth equations. It is worth pointing out the following points.
- No nondegeneracy assumption is needed.

- Only reduced linear systems are solved at each iteration.

- The points generated can violate the constraint $x \ge 0$.

Although very simple, we think the local algorithm outlined above enjoys some interesting properties. If we compare it to the classical local linearization method of Josephy and Robinson [32, 14], we see that we have two advantages: the regularity assumption required (b-regularity) is weaker than the R-regularity assumption used in [32]; furthermore, the methods described in [32] require, at each iteration, the solution of a full-dimensional linear complementarity problem, which is obviously a computationally more intensive task than solving a linear system. Recently Pang [23] has shown that it is possible to relax the R-regularity assumption in a Josephy-Robinson scheme; however, using this weaker assumption, the linear complementarity problem that has to be solved at each iteration can have multiple solutions and a suitable one has to be selected, which is by no means an easy task. There exist other local methods which solve, at each iteration, only a (full-dimensional) linear system, see, e.g., [7, 12, 15, 16, 20, 34]; however, as far as we are aware, all these methods require nondegeneracy of the solution to achieve superlinear convergence. We conclude this section by pointing out a simple by-product of the proof technique used in Theorem 5.2, namely that b-regularity implies local uniqueness of the solution $\bar x$. This result slightly improves on Corollary 4.7 in [19] by relaxing the twice continuous differentiability of $F$ used there, and it will be needed in the sequel.
Proposition 5.4 Suppose that a solution $\bar x$ of (NC) is b-regular. Then $\bar x$ is a locally unique solution.

Proof. The proof is by contradiction. Suppose that we can find a solution $\tilde x$ of the complementarity problem as close as we want to $\bar x$, and define the following index sets:

$$\tilde\alpha := \{i \mid \tilde x_i > 0\}, \qquad \tilde\beta := \{i \mid \tilde x_i = 0 = F_i(\tilde x)\}, \qquad \tilde\gamma := \{i \mid F_i(\tilde x) > 0\}.$$

By continuity it is easy to check that, if $\tilde x$ is sufficiently close to $\bar x$, we have

$$\alpha \subseteq \tilde\alpha, \qquad \gamma \subseteq \tilde\gamma,$$

from which we easily get

$$\gamma \subseteq \tilde\gamma \subseteq \tilde\gamma \cup \tilde\beta \subseteq \gamma \cup \beta. \tag{23}$$

But (23) implies that we can find a set $A$ such that

$$\gamma \subseteq A \subseteq \gamma \cup \beta \qquad \text{and} \qquad \tilde\gamma \subseteq A \subseteq \tilde\gamma \cup \tilde\beta,$$

which in turn, using the notation of the proof of Theorem 5.2, implies that both $\bar x$ and $\tilde x$ are solutions of the system of equations $H_A(x) = 0$. But we have already observed in the proof of Theorem 5.2 that the b-regularity of $\bar x$ implies the nonsingularity of the Jacobian $\nabla H_A(\bar x)$ (see (22)). So $\bar x$ is a locally unique solution of the system $H_A(x) = 0$, and this contradicts the arbitrary closeness of $\tilde x$, thus proving the proposition. □
5.2 The global algorithm

In this section we exploit the merit function $\Psi$ to globalize, in a simple way, the local algorithm.

Global Algorithm

Data: $x^0 \in \mathbb{R}^n$, $\varepsilon > 0$, $\rho > 0$, $p > 1$, $\beta \in (0, 1/2)$, $\sigma \in (0, 1)$.

Step 0: Set $k = 0$.

Step 1: (stopping criterion) If the stopping criterion is satisfied, stop.

Step 2: Calculate the "local direction" $d^k$ according to (18)-(19). If system (19) is not solvable, set $d^k = -\nabla \Psi(x^k)$.

Step 3: If

$$\Psi(x^k + d^k) \le \sigma\, \Psi(x^k), \tag{24}$$

set $x^{k+1} = x^k + d^k$, set $k \leftarrow k + 1$ and go to Step 1.

Step 4: (linesearch) If $d^k$ does not satisfy the following test:

$$\nabla \Psi(x^k)^T d^k \le -\rho \|d^k\|^p, \tag{25}$$

set $d^k = -\nabla \Psi(x^k)$. Find the smallest $i_k = 0, 1, 2, \dots$ such that

$$\Psi(x^k + 2^{-i_k} d^k) \le \Psi(x^k) + \beta\, 2^{-i_k}\, \nabla \Psi(x^k)^T d^k; \tag{26}$$

set $x^{k+1} = x^k + 2^{-i_k} d^k$, set $k \leftarrow k + 1$ and go to Step 1.
Lemma 5.5 Let H : IRn ! IRn be a semismooth function and let x 2 IRn be such that H (x) = 0
and such that every generalized Jacobian of H at x is nonsingular. Suppose that we are given two sequences fxk g and fdk g such that (i) fxk g ! x
20
(ii) lim kxkx+d?x?kxk = 0. Then k
k
k
kH (xk + dk )k = 0: lim k!1 kH (xk )k
(27)
Proof. Since H is semismooth at x, we can write, by Proposition 1 of [25] H (xk + dk ) = H (x) + W k (xk + dk ? x) + o(kxk + dk ? xk); 8W k 2 @H (xk + dk ); H (xk ) = H (x) + Z k (xk ? x) + o(kxk ? xk); 8Z k 2 @H (xk): Since H (x) = 0, this implies
kH (xk + dk )k = lim kW k (xk + dk ? x) + o(kxk + dk ? xk)k lim k!1 kH (xk )k k!1 kZ k (xk ? x) + o(kxk ? xk)k kW k (xk + dk ? x)k + ko(kxk + dk ? xk)k klim !1 j kZ k (xk ? x)k ? ko(kxk ? xk)k j
kW k (xk + dk ? x)k 1 + kkWo(kx(x++dd??xkx))kk = klim !1 kZ k (xk ? x)k 1 ? kkoZ(k(xx ??xxk))kk 2kW k (xk + dk ? x)k = 0; klim !1 12 kZ k (xk ? x)k k
k
k
k
k
k
k
k
where we have taken into account (ii) and the fact that, by the nonsingularity assumption on the generalized Jacobians of H at x, the sequences of matrices fW k g and fZ k g are such that there exists two positive constants cm and cM such that cm kW k k cM and cm kZ k k cM for all k, so that
kW k (xk + dk ? x)k = O(kxk + dk ? xk);
and
The chain of inequalities obviously implies the thesis.
kZ k (xk ? x)k = O(kxk ? xk): 2
Lemma 5.6 Let x be an R ? regular solution of the nonlinear complementarity problem, and suppose that we are given two sequences fxk g and fdk g such that (i) fxk g ! x
(ii) lim kxkx+d?x?kxk = 0. Then k
k
k
(xk + dk ) = 0: lim k!1 (xk )
Proof. By Proposition 3.5 and Proposition 3.2 we can apply Lemma 5.5 to the system (x) = 0. 2
The assertion then follows by squaring (27).
Lemma 5.7 Suppose that fxk g and fdk g is a subsequence of points and corresponding directions generated by the Global Algorithm. Then if fxk g ! x and fdk g ! 0, then r (x) = 0. 21
Proof. If $d^k = -\nabla \Psi(x^k)$ for an infinite number of indices $k$, the lemma follows trivially from the continuity of the gradient of $\Psi$. So, without loss of generality, we examine the case in which $d^k$ is always generated according to (18)-(19). Furthermore, we can also assume, subsequencing if necessary, that $A^k = A$ and $N^k = N$, i.e., that the sets $A^k$ and $N^k$ are independent of the iteration. Since we are assuming $\{d^k\} \to 0$, (18) implies

$$\bar x_A = 0, \tag{28}$$

which, in turn, implies by (19) and the boundedness of $\nabla F_{NN}(x^k)$ that

$$F_N(\bar x) = 0. \tag{29}$$

By continuity and by the definition of the sets $A$ and $N$, (28) and (29) imply

$$\bar x_N \ge 0, \qquad F_A(\bar x) \ge 0. \tag{30}$$

Equations (28), (29) and (30) imply that $\bar x$ is a solution of the nonlinear complementarity problem, so that it is a (global) minimum point of $\Psi$ and hence $\nabla \Psi(\bar x) = 0$. □

We can now prove the main result of this section.
Theorem 5.8 The following statements hold.

a. Each accumulation point of the sequence $\{x^k\}$ generated by the algorithm is a stationary point of $\Psi$.

b. If one of the limit points of the sequence $\{x^k\}$, say $\bar x$, is a b-regular solution of Problem (NC), then $\{x^k\} \to \bar x$.

c. If $\{x^k\} \to \bar x$, $\bar x$ is an R-regular solution of Problem (NC), and $\nabla F$ is locally Lipschitzian in a neighborhood of $\bar x$, then:

1. eventually $d^k$ is always the "local" direction defined in the previous subsection (i.e., eventually the antigradient is never used);

2. eventually the stepsize of one is always accepted, so that $x^{k+1} = x^k + d^k$;

3. the convergence rate is quadratic.
0 < kdk k D: (31) In fact if, for a certain subsequence of points, test (24) is passed this would imply, recalling that at each step (xk+1 ) < (xk ), that f (xk )g ! 0, so that x is a global minimum point and hence r (x) = 0. On the other hand if, for some subsequence K , fkdk kg ! 0, we have that r (x) = 0 by Lemma 5.7, while kdkk cannot be unbounded because, taking into account that r (xk ) is bounded and p > 1, this would contradict (25). 22
Then, since at each iteration (26) holds and is bounded from below on the bounded sequence fxk g we have that f (xk+1 ) ? (xk )g ! 0 which implies, by the linesearch test,
f2?i r (xk ) dk g ! 0: k
(32)
T
We want to show that 2?i is bounded away from 0. Suppose the contrary. Then, subsequencing if necessary, we have that f2?i g ! 0 so that at each iteration the stepsize is reduced at least once and (26) gives (xk + 2?(i ?1) dk ) ? (xk ) > r (xk ) dk : (33) 2?(i ?1) By (31) we can assume, subsequencing if necessary, that fdk g ! d 6= 0, so that, passing to the limit in (33), we get r (x) d r (x) d: (34) On the other hand we also have, by (25), that r (x) d ?kdkp < 0, which contradicts (34); hence 2?i is bounded away from 0. But then (32) and (25) imply that fdk g ! 0 so that r (x) = 0 by Lemma 5.7 and point (a) is proved. (b) Since x is a b-regular solution then x is an isolated global minimum point of by Proposition 5.4. Denote by the set of limit points of the sequence fxk g; we have that x belongs to which is therefore a nonempty set. Let be the distance of x to n x, if x is not the only limit point of fxk g, 1 otherwise, i.e. 8 < distfxj n xg if n x 6= ; =: 1 otherwise; since x is an isolated solution > 0. Let us now indicate by 1 and 2 the following sets, k
k
k
T
k
T
T
T
k
1 = fx 2 IRn : dist(xj g =4g;
2 = fx 2 IRn : kxk kxk + g:
We have that for k suciently large, let us say for k k, xk belongs at least to one of the two sets 1 and 2 . Let now K be the subsequence of all k for which kxk ? xk =4 (this set is obviously nonempty because x is a limit point of the sequence). Since all points of the subsequence fxk gK are contained in the compact set S (x; =4) and every limit point of this subsequence is also a limit point of fxk g, we have that all the subsequence fxk gK converges to x, the unique limit point of fxk g in S (x; =4). Since x is b-regular, taking into account that r (x) = 0 and the de nition of dk , we have that fdk g !K 0. So we can nd k~ k such that kdk k =4 if k 2 K and k k~. Let now k^ be any xed k k~ belonging to K ; we can write:
distfxk^+1j n xg inf y2 nxfky ? xkg ? (kx ? xk^k + kxk^ ? xk^+1k) ? =4 ? =4 = =2:
(35)
This implies that xk^+1 cannot belong to 1 n S (x; =4); on the other hand, since xk^+1 = xk^ + k^ dk^ for some k^ 2 (0; 1], we have
kxk^+1k kxk^k + kk^dk^k kx + (xk^ ? x)k + kdk^k kxk + kxk^ ? xk + kdk^k kxk + =4 + =4; 23
so that xk^+1 does not belong to 2 . Hence we get that xk^+1 belongs to S (x; =4). But then, by de nition, we have that k^ + 1 2 K , so by induction (recall that k^ + 1 > k~ also, so that kdk^+1k =4) we have that every k > k~ belongs to K and the whole sequence converges to x. (c) Since x is R-regular we have that it is also b-regular, so that the local direction (18)-(19) is well de ned. The three assertions then easily follow by Theorem 5.2, test (24) and Lemma 5.6.
2
We note that, in general, what we can guarantee is that every limit point $\bar x$ (if any) is a stationary point of $\Psi$. If $\Psi(\bar x) = 0$, then $\bar x$ is also a solution of the nonlinear complementarity problem. According to the results of Section 4, we can be sure that every limit point of the sequence generated by the algorithm is a solution of Problem (NC) if $F$ is a $P_0$-function. If $F$ is a uniform $P$-function, we can also guarantee the existence of a limit point; actually, in this latter case it is elementary to show that the whole sequence converges to the unique solution of the complementarity problem. It is also possible to give conditions on the function $F$ only at $\bar x$ which guarantee that $\bar x$ is a solution of the nonlinear complementarity problem; this leads to an analysis similar to the one carried out in, e.g., [8, 22, 24]. The most obvious of these conditions is: $\nabla F(\bar x)$ is a $P_0$-matrix, as can easily be seen from the proof of Theorem 4.1. However, we do not pursue this kind of analysis here and leave it for future research.

In the linear case, Theorem 5.3 and Theorem 5.8 readily give the following result.

Theorem 5.9 Suppose that the global algorithm of this section is applied to a linear complementarity problem, and that one of the limit points, say $\bar x$, of the sequence generated by the algorithm is a b-regular solution. Then the algorithm converges in a finite number of steps to $\bar x$.

In particular, in the case of linear complementarity problems with $F(x) = Mx + q$, we have, by Theorem 4.1 and Theorem 4.2, that the algorithm converges to the unique solution of the problem in a finite number of steps if $M$ is a $P$-matrix.
6 Conclusion

We have studied the properties of a new merit function which allows us to reduce a nonlinear complementarity problem to an unconstrained minimization problem under conditions weaker than those previously known. Based on this merit function, we have also defined a globally and superlinearly convergent algorithm for the solution of the nonlinear complementarity problem. The new algorithm has a low cost per iteration compared to algorithms with similar characteristics, and its properties are established under very mild assumptions; the numerical results reported in [6] are very encouraging. We think that these results, along with those reported in [12, 36], indicate that the function $\varphi$ is a very valuable tool in the solution of nonlinear complementarity problems. In particular, we feel that the semismoothness of $\Phi$ and the SC1 property of $\Psi$ have not been fully exploited yet, even if we think that, following recent results reported in [5, 25, 27, 29], they could lead to extremely interesting algorithms; we are currently investigating these topics and hope to report on this research in the near future.
References

[1] G. Auchmuty. Variational principles for variational inequalities. Numerical Functional Analysis and Optimization, 10, pp. 863-874, 1989.
[2] F.H. Clarke. Optimization and Nonsmooth Analysis. John Wiley & Sons, New York, 1983.
[3] R.W. Cottle, J.-S. Pang and R.E. Stone. The Linear Complementarity Problem. Academic Press, New York, 1992.
[4] S.P. Dirkse and M.C. Ferris. The PATH solver: A non-monotone stabilization scheme for mixed complementarity problems. Report 1179, University of Wisconsin, Madison, Wisconsin, USA, 1993.
[5] F. Facchinei. Minimization of SC1 functions and the Maratos effect. DIS Technical Report 10.94. To appear in Operations Research Letters.
[6] F. Facchinei and J. Soares. Testing a new class of algorithms for nonlinear complementarity problems. To appear in Variational Inequalities and Network Equilibrium Problems, F. Giannessi editor, Plenum Press.
[7] M. Ferris and S. Lucidi. Globally convergent methods for nonlinear equations. Journal of Optimization Theory and Applications, 81, pp. 53-71, 1994.
[8] M. Ferris and D. Ralph. Projected gradient methods for nonlinear complementarity problems via normal maps. To appear in Recent Advances in Nonsmooth Optimization, World Scientific Publishers.
[9] A. Fischer. A special Newton-type optimization method. Optimization, 24, pp. 269-284, 1992.
[10] A. Fischer. A special Newton-type method for positive semidefinite linear complementarity problems. Technical Report, Institute of Numerical Mathematics, Dresden University of Technology, Dresden, Germany, 1992. To appear in Journal of Optimization Theory and Applications.
[11] M. Fukushima. Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Mathematical Programming, Series A, 53, pp. 99-110, 1992.
[12] C. Geiger and C. Kanzow. On the resolution of monotone complementarity problems. Preprint 82, Institute of Applied Mathematics, University of Hamburg, Hamburg, Germany, 1994.
[13] P.T. Harker and B. Xiao. Newton's method for the nonlinear complementarity problem: A B-differentiable equation approach. Mathematical Programming, Series A, 48, pp. 339-357, 1990.
[14] N.H. Josephy. Newton's method for generalized equations. MRC Technical Summary Report 1965, Mathematics Research Center, University of Wisconsin-Madison, Wisconsin, 1979.
[15] C. Kanzow. Some equation-based methods for the nonlinear complementarity problem. Optimization Methods and Software, 3, pp. 327-340, 1994.
[16] C. Kanzow. Nonlinear complementarity as unconstrained optimization. Preprint 67, Institute of Applied Mathematics, University of Hamburg, Hamburg, Germany, 1993. To appear in Journal of Optimization Theory and Applications.
[17] C. Kanzow. Global convergence properties of some iterative methods for linear complementarity problems. Preprint 72, Institute of Applied Mathematics, University of Hamburg, Hamburg, Germany, 1993, revised 1994. To appear in SIAM Journal on Optimization.
[18] O.L. Mangasarian. Equivalence of the complementarity problem to a system of nonlinear equations. SIAM Journal on Applied Mathematics, 31, pp. 89-92, 1976.
[19] O.L. Mangasarian. Locally unique solutions of quadratic programs, linear and nonlinear complementarity problems. Mathematical Programming, 19, pp. 200-212, 1980.
[20] O.L. Mangasarian and M.V. Solodov. Nonlinear complementarity as unconstrained and constrained minimization. Mathematical Programming, Series B, 62, pp. 277-297, 1993.
[21] R. Mifflin. Semismooth and semiconvex functions in constrained optimization. SIAM Journal on Control and Optimization, 15, pp. 957-972, 1977.
[22] J.J. Moré. Global methods for nonlinear complementarity problems. Preprint MCS-P429-0494, Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, Illinois, USA, 1994.
[23] J.-S. Pang. Convergence of splitting and Newton methods for complementarity problems: an application of some sensitivity results. Mathematical Programming, Series A, 58, pp. 149-160, 1993.
[24] J.-S. Pang and S.A. Gabriel. NE/SQP: A robust algorithm for the nonlinear complementarity problem. Mathematical Programming, Series A, 60, pp. 295-337, 1993.
[25] J.-S. Pang and L. Qi. Nonsmooth equations: motivation and algorithms. SIAM Journal on Optimization, 3, pp. 443-465, 1993.
[26] J.-S. Pang and L. Qi. A globally convergent Newton method for convex SC1 minimization problems. To appear in Journal of Optimization Theory and Applications.
[27] L. Qi. A convergence analysis of some algorithms for solving nonsmooth equations. Mathematics of Operations Research, 18, pp. 227-244, 1993.
[28] L. Qi and H. Jiang. Karush-Kuhn-Tucker equations and convergence analysis of Newton methods and quasi-Newton methods for solving these equations. Technical Report AMR 94/5, School of Mathematics, University of New South Wales, Australia, 1994.
[29] L. Qi and J. Sun. A nonsmooth version of Newton's method. Mathematical Programming, Series A, 58, pp. 353-368, 1993.
[30] D. Ralph. Global convergence of damped Newton's method for nonsmooth equations via the path search. Mathematics of Operations Research, 19, pp. 352-389, 1994.
[31] S.M. Robinson. Strongly regular generalized equations. Mathematics of Operations Research, 5, pp. 43-62, 1980.
[32] S.M. Robinson. Generalized equations. In Mathematical Programming: The State of the Art, A. Bachem, M. Groetschel and B. Korte editors, pp. 346-367, Springer-Verlag, Berlin, 1983.
[33] S.M. Robinson. Normal maps induced by linear transformations. Mathematics of Operations Research, 17, pp. 691-714, 1992.
[34] P.K. Subramanian. Gauss-Newton methods for the complementarity problem. Journal of Optimization Theory and Applications, 77, pp. 467-482, 1993.
[35] K. Taji, M. Fukushima and T. Ibaraki. A globally convergent Newton method for solving strongly monotone variational inequalities. Mathematical Programming, Series A, 58, pp. 369-383, 1993.
[36] P. Tseng. Growth behavior of a class of merit functions for the nonlinear complementarity problem. Manuscript, 1994.
[37] B. Xiao and P.T. Harker. A nonsmooth Newton method for variational inequalities, I: theory. Mathematical Programming, Series A, 65, pp. 151-194, 1994.
[38] B. Xiao and P.T. Harker. A nonsmooth Newton method for variational inequalities, II: numerical results. Mathematical Programming, Series A, 65, pp. 195-216, 1994.
[39] N. Yamashita and M. Fukushima. On stationary points of the implicit Lagrangian for nonlinear complementarity problems. Information Science Technical Report, Nara Institute of Science and Technology, Nara, Japan, 1993. To appear in Journal of Optimization Theory and Applications.