
A Parameterized Newton Method and A Quasi-Newton Method for Nonsmooth Equations


Xiaojun Chen and Liqun Qi
School of Mathematics, University of New South Wales
P.O. Box 1, Kensington, NSW 2033, Australia
(Revised in October 1993)

ABSTRACT. This paper presents a parameterized Newton method using generalized Jacobians and a Broyden-like method for solving nonsmooth equations. The former ensures that the method is well-defined even when the generalized Jacobian is singular. The latter is constructed by using an approximation function, which can be formed for nonsmooth equations arising from partial differential equations and nonlinear complementarity problems. The approximation function method generalizes the splitting function method for nonsmooth equations. Locally superlinear convergence results are proved for the two methods. Numerical examples are given to compare the two methods with some other methods.

1. Introduction

Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be continuous but not necessarily differentiable. We consider the system of nonlinear equations
$$F(x) = 0, \quad x \in \mathbb{R}^n. \tag{1}$$
Such systems of nonsmooth equations have attracted attention in both optimization and numerical analysis. An interesting point is that the work in these two areas was somewhat separate before this paper. Different types of problems give rise to nonsmooth equations in the two areas. Nonsmooth equations studied in optimization arise from nonlinear complementarity, variational inequality, nonlinear programming and maximal monotone operator problems [23]. Nonsmooth equations studied in numerical analysis arise from the numerical solution of nonsmooth partial differential equations [1,3,13] and nonsmooth compact fixed point problems [13].

(This work is supported by the Australian Research Council.)

Iterative methods for solving nonsmooth equations may be classified into five classes: 1) generalized Jacobian methods; 2) iteration function methods; 3) splitting function methods; 4) quasi-Newton methods; 5) minimization methods. This classification may not be complete, but it helps summarize the current state of algorithmic development.

Generalized Jacobian methods use a generalized Jacobian of $F$ to play the role of $F'$ in the classical Newton method. Let $F$ be locally Lipschitzian. Then Rademacher's theorem implies that $F$ is differentiable almost everywhere. Let $D_F$ be the set where $F$ is differentiable, and let

$$\partial_B F(x) = \Big\{ \lim_{\substack{x^i \to x \\ x^i \in D_F}} \nabla F(x^i) \Big\}.$$

The generalized Jacobian of $F$ at $x \in \mathbb{R}^n$ in the sense of Clarke [5] is
$$\partial F(x) = \mathrm{conv}\, \partial_B F(x),$$
which is a nonempty convex compact set. The generalized Jacobian based Newton method is defined by
$$x^{k+1} = x^k - V_k^{-1} F(x^k), \quad V_k \in \partial F(x^k). \tag{2}$$
Recently, Qi [24] suggested a modified version of (2), where
$$x^{k+1} = x^k - V_k^{-1} F(x^k), \quad V_k \in \partial_B F(x^k). \tag{3}$$
This modification reduces the nonsingularity requirement on members of $\partial F(x^k)$ to members of $\partial_B F(x^k)$. The method (2) was studied in [16]. Superlinear convergence for (2) and (3) was established in [24] and [27], under the condition that $F$ is semismooth. A generalized Dennis–Moré theorem was proved in [23].
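For illustration only, a minimal Python sketch of method (3) might look as follows; it is not the authors' code, and it assumes the user supplies $F$ and a routine returning some element of $\partial_B F(x)$ (such a routine for the min-function (12) is described in Section 5).

```python
import numpy as np

def b_newton(F, element_of_dB, x0, tol=1e-10, max_iter=50):
    """Generalized Jacobian based Newton method (3): at each iterate,
    take one element V of partial_B F(x) and solve V d = -F(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            break
        V = element_of_dB(x)              # some V in partial_B F(x)
        x = x - np.linalg.solve(V, Fx)    # fails if V is singular --
    return x                              # the motivation for (11) below
```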

The iteration function method may be traced back to the work of Robinson [29]. He introduced a point-based approximation $A : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ to $F$; that is, for each $y$ the function $A(\cdot, y)$ is an approximation to $F$ around $y$. The next iterate is determined by solving
$$A(x, x^k) = 0 \tag{4}$$
for $x^{k+1}$. Usually (4) is nonlinear. This method was also called Newton's method in [29] and was further developed in [28]. We call it an iteration function method to distinguish it from (2) and (3), where a linear subproblem is solved. Pang [20] suggested the method
$$\left\{ \begin{array}{ll} \text{solve } F(x^k) + F'(x^k; d) = 0 & \text{for } d^k, \\ \text{set } x^{k+1} = x^k + d^k, \end{array} \right. \tag{5}$$
under the assumption that $F$ is directionally differentiable. This may also be classified as an iteration function method. Han, Pang and Rangaraj [10] introduced a function $G(\cdot, \cdot) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ and generalized (5) to
$$\left\{ \begin{array}{ll} \text{solve } F(x^k) + G(x^k, d) = 0 & \text{for } d^k, \\ \text{set } x^{k+1} = x^k + d^k. \end{array} \right. \tag{6}$$
Combining (6) with a line search, global convergence was established in [10].

There are some difficulties with the above methods. At each step of (2) or (3), a generalized Jacobian and its inverse are needed, so the method fails when the generalized Jacobian is singular at some $x^k$. In (4), (5) or (6), a nonlinear system needs to be solved at each step, but its solvability is not assured (see Section 5 for an example).

A third type of method deals with the nonsmoothness by splitting the nonsmooth function into smooth and nonsmooth parts. That is, $F$ is split as $F = \phi + \psi$, where $\phi$ is continuously differentiable and $\psi$ is continuous, nondifferentiable but relatively small. Convergence theorems and error estimates for the Krasnoselskii–Zincenko iteration
$$x^{k+1} = x^k - \phi'(x^k)^{-1} F(x^k) \tag{7}$$
were given in [19,32,34]. Recently, Chen and Yamamoto [1,3,4,33] discussed the Newton-like method
$$x^{k+1} = x^k - A(x^k)^{-1} F(x^k), \tag{8}$$

where $A(x^k) \in \mathbb{R}^{n \times n}$ is an approximation to $\phi'(x^k)$. Let $L = [\underline{l}, \overline{l}\,]$ be an interval matrix satisfying
$$\psi(x^k) - \psi(x^{k-1}) \in L(x^k - x^{k-1}),$$
where $\underline{l}, \overline{l} \in \mathbb{R}^{n \times n}$ and $\underline{l} \le \overline{l}$ (that is, $\underline{l}_{i,j} \le \overline{l}_{i,j}$ for $i, j = 1, \ldots, n$). The method (8) with $A(x^k) = \phi'(x^k) + l^k$, $l^k \in L$, was discussed in [3]. Another particular case of (8) is the Broyden-like method [1]:

$$\begin{array}{l} x^{k+1} = x^k - B_k^{-1} F(x^k), \\[4pt] B_{k+1} = B_k + \dfrac{(t^k - B_k s^k)(s^k)^T}{(s^k)^T s^k}, \\[4pt] s^k = x^{k+1} - x^k, \quad t^k = \phi(x^{k+1}) - \phi(x^k). \end{array} \tag{9}$$
Local convergence and semi-local convergence were established in [1]. The definition of semi-local convergence can be found in [19]. However, it seems that the splitting technique has not been applied to nonsmooth equations arising from optimization problems. For example, consider the nonlinear complementarity problem (NCP): find $x \in \mathbb{R}^n$ such that
$$f(x) \ge 0, \quad g(x) \ge 0, \quad f(x)^T g(x) = 0,$$
where $f$ and $g$ are two continuously differentiable mappings from $\mathbb{R}^n$ to $\mathbb{R}^n$. The NCP can also be formulated as a system of smooth nonlinear equations [7] or a differentiable optimization problem [8]. However, singularity and other difficulties may appear there.

Quasi-Newton methods are efficient for solving smooth nonlinear equations [6,19]. Their application to nonsmooth equations has been considered in [1,3,14,15]. In [3,14], quasi-Newton methods such as the Broyden method
$$\begin{array}{l} x^{k+1} = x^k - \tilde{B}_k^{-1} F(x^k), \\[4pt] \tilde{B}_{k+1} = \tilde{B}_k + \dfrac{(\tilde{t}^k - \tilde{B}_k s^k)(s^k)^T}{(s^k)^T s^k}, \\[4pt] s^k = x^{k+1} - x^k, \quad \tilde{t}^k = F(x^{k+1}) - F(x^k), \end{array} \tag{10}$$
were directly applied to (1). Superlinear convergence theorems were established under the assumption that $F$ is strongly F-differentiable at the solution

[14]. This is too restrictive for nonsmooth equations. In [15], quasi-Newton methods were applied to piecewise smooth equations; for each piece, a new starting iteration matrix was used, which is not efficient.

Obviously, a vector $x$ solves $F(x) = 0$ if and only if it is a global minimum point of the problem $\min \|F(x)\|$ with minimum value $\|F(x)\| = 0$. Minimization methods for minimizing $\|F(x)\|$ can generally be written as

$$x^{k+1} = x^k - \omega_k p^k,$$
where $\omega_k$ is the steplength and $p^k$ is a direction vector. The sequence $\{x^k\}$ satisfies
$$\|F(x^{k+1})\| \le \|F(x^k)\|.$$
The damped Gauss–Newton method is a typical minimization method. Global convergence of the damped Gauss–Newton method for solving nonsmooth equations was established by Pang and Gabriel [22]. Recently, Gabriel and Pang presented a trust region method for solving nonsmooth equations [9].

In this paper, we consider two modifications of methods (3) and (10) for solving the system of nonsmooth equations (1). In Section 2 we consider a parameterized Newton method

$$x^{k+1} = x^k - \mu_k (V_k + \lambda_k I)^{-1} F(x^k), \quad V_k \in \partial_B F(x^k), \tag{11}$$
where $I$ is the $n \times n$ identity matrix and the parameters $\mu_k$ and $\lambda_k$ are chosen to ensure that $\{x^k\}$ converges and that $V_k + \lambda_k I$ is invertible, respectively. A simple strategy is to set $\mu_k \equiv \mu \in (0, 1]$ a constant, $\lambda_k = 0$ if $V_k$ is nonsingular, and $\lambda_k = \|V_k\| + \delta$ if $V_k$ is singular, where $\delta$ is a small positive number. Locally superlinear convergence of the method (11) is proved in Section 2. The method (11) extends the modified Newton methods (21) and (23) for smooth equations in Section 7.1 of [19]. Another way to overcome singularity is to use a generalized inverse or an outer inverse [2,18].
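As a sketch of one step of (11) with this simple strategy (illustrative Python only, not the authors' implementation; `element_of_dB` is the same assumed helper as before, and the Frobenius norm stands in for $\|V_k\|$):

```python
import numpy as np

def param_newton_step(F, element_of_dB, x, mu=1.0, delta=1e-6):
    """One step of the parameterized Newton method (11):
    x+ = x - mu * (V + lam*I)^{-1} F(x), with lam = 0 when V is
    nonsingular and lam = ||V|| + delta otherwise."""
    V = element_of_dB(x)
    n = len(x)
    lam = 0.0
    if np.linalg.matrix_rank(V) < n:      # regularize only if singular
        lam = np.linalg.norm(V) + delta
    return x - mu * np.linalg.solve(V + lam * np.eye(n), F(x))
```

Since every eigenvalue of $V_k$ is bounded in modulus by $\|V_k\|_F$, the shift $\|V_k\| + \delta$ indeed makes $V_k + \lambda_k I$ invertible.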

In Section 3 we apply a splitting technique to the nonsmooth equation
$$F(x) = \min(g(x), f(x)) = 0, \tag{12}$$
where the "min" operator denotes the componentwise minimum of two vectors. The equation (12) is a "min" formulation of the NCP. Using a splitting of $F$, we solve the NCP by (7) and (9), and by other splitting methods.

In Section 4 we generalize the splitting technique by introducing an approximation function. Let $P : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be a given function such that, for each $y \in \mathbb{R}^n$, $P_y(\cdot) \equiv P(\cdot, y)$ is a differentiable approximation function to $F$ around $y$ and $P_x(x) = F(x)$. Some further conditions on $P$ will be given in Section 4. Let $P_k(x) \equiv P_{x^k}(x)$. Consider the following Broyden-like method

$$\begin{array}{l} x^{k+1} = x^k - B_k^{-1} F(x^k), \\[4pt] B_{k+1} = B_k + \dfrac{(t^k - B_k s^k)(s^k)^T}{(s^k)^T s^k}, \\[4pt] s^k = x^{k+1} - x^k, \quad t^k = P_k(x^{k+1}) - P_k(x^k). \end{array} \tag{13}$$
We give a necessary and sufficient condition for superlinear convergence of (13), and examples of functions possessing approximation functions $P$. In a certain sense, this method combines the ingredients of the iteration function method, the splitting function method and the quasi-Newton methods.

In Section 5 we present some computational results. We compare methods (9), (11) and (13) with methods (3), (6) and (10) on the Kojima–Shindo four-variable problem [15]. We test the method (13) on the Hansen–Koopmans invariant capital stock problem [11,22], the Nash–Cournot production problem [12,22] and the Walrasian production-price problem [30,22].
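A rough Python sketch of the Broyden-like method (13) described above follows; it is illustrative only, and it assumes user-supplied handles `F` and `P`, where `P(x, y)` evaluates $P_y(x)$ and the initial matrix `B` is given by the caller.

```python
import numpy as np

def broyden_pbsa(F, P, x, B, tol=1e-6, max_iter=500):
    """Broyden-like method (13): the secant vector t_k is built from
    the smooth approximation P_k = P(., x^k) instead of F itself."""
    for _ in range(max_iter):
        s = np.linalg.solve(B, -F(x))      # s^k = x^{k+1} - x^k
        if np.linalg.norm(s) <= tol:
            break
        x_new = x + s
        t = P(x_new, x) - P(x, x)          # t^k = P_k(x^{k+1}) - P_k(x^k)
        B = B + np.outer(t - B @ s, s) / (s @ s)
        x = x_new
    return x
```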

As $F$ is locally Lipschitzian, it follows from [5] and [31] that $F$ is F-differentiable at $x$ if and only if it is G-differentiable at $x$, and that $F$ is B-differentiable at $x$ if and only if it is directionally differentiable at $x$. Let $S(x, r)$ denote the open ball in $\mathbb{R}^n$ with center $x$ and radius $r$.

2. Semismoothness and a parameterized Newton method

In this section we discuss convergence of the parameterized Newton method (11).

Definition 1. A function $F : \mathbb{R}^n \to \mathbb{R}^n$ is said to be B-differentiable at a point $x \in \mathbb{R}^n$ if it is directionally differentiable at $x$ and
$$\lim_{h \to 0} \frac{F(x + h) - F(x) - F'(x; h)}{\|h\|} = 0. \tag{14}$$
□

We may write (14) as $F(x + h) = F(x) + F'(x; h) + o(\|h\|)$.

Definition 2. A function $F : \mathbb{R}^n \to \mathbb{R}^n$ is said to be semismooth at a point $x \in \mathbb{R}^n$ if $F$ is locally Lipschitzian at $x$ and
$$\lim_{\substack{V \in \partial F(x + t h') \\ h' \to h,\ t \downarrow 0}} \{ V h' \} \tag{15}$$
exists for any $h \in \mathbb{R}^n$. □

Many practical functions are semismooth [24,27]. If $F$ is semismooth at $x$, then $F$ is directionally differentiable at $x$ and $F'(x; h)$ is equal to the limit in (15).

Theorem 1. Suppose that $x^*$ is a solution of (1), $F$ is semismooth at $x^*$ and all $V \in \partial_B F(x^*)$ are nonsingular. Let $\|V\| \le \beta$ and $\|V^{-1}\| \le \gamma$ for all $V \in \partial_B F(x^*)$. Let $\epsilon$, $\mu_k$ and $\lambda_k$ satisfy
$$0 < \bar{\mu} \le \mu_k \le 1, \quad \epsilon < 1/\gamma, \quad 0 < \gamma\big(3\epsilon + (1 - \bar{\mu})(\beta + \epsilon)\big) < 1$$
and
$$|\lambda_k| \le \hat{\lambda} < \frac{1 - \gamma\big(3\epsilon + (1 - \bar{\mu})(\beta + \epsilon)\big)}{2\gamma}.$$
Then there exists a positive number $r$ such that for any $x^0 \in S(x^*, r)$, the sequence $\{x^k\}$ defined by (11) is well defined and converges linearly to $x^*$. Furthermore, if $\mu_k \to 1$ and $\lambda_k \to 0$ as $k \to \infty$, then $\{x^k\}$ converges superlinearly to $x^*$.

Proof. First we claim that there is a positive number $r$ such that for any $x \in S(x^*, r)$ and $V_x \in \partial_B F(x)$ we have
$$\|F(x) - F(x^*) - F'(x^*; x - x^*)\| < \epsilon \|x - x^*\|, \tag{16}$$
$$\|V_x(x - x^*) - F'(x^*; x - x^*)\| < \epsilon \|x - x^*\|, \tag{17}$$
$$\|V_x - V\| < \epsilon \quad \text{for some } V \in \partial_B F(x^*). \tag{18}$$
Inequality (16) follows from the definition of B-differentiability. Inequality (17) follows from Lemma 2.2 in [24]. If (18) were not true, there would be a sequence $\{y^k \mid y^k \in D_F\}$ with $y^k \to x^*$ such that
$$\|\nabla F(y^k) - V\| \ge \epsilon \quad \text{for all } V \in \partial_B F(x^*). \tag{19}$$
By passing to a subsequence, we may assume that $\{\nabla F(y^k)\}$ converges to some $V \in \partial_B F(x^*)$. This contradicts (19). Hence (18) holds, and
$$\|V_x + \lambda I - V\| \le \epsilon + |\lambda| < 1/\gamma, \quad \text{if } |\lambda| \le \hat{\lambda}.$$
This implies that $V_x + \lambda I$ is nonsingular and
$$\|(V_x + \lambda I)^{-1}\| \le \frac{\gamma}{1 - \gamma(\epsilon + |\lambda|)}.$$
Therefore (11) is well defined for $x^k \in S(x^*, r)$ and $|\lambda_k| \le \hat{\lambda}$. Furthermore, (18) implies $\|V_k\| \le \epsilon + \|V\| \le \epsilon + \beta$. Now,
$$\begin{array}{rl} \|x^{k+1} - x^*\| \!\!\!& \le \|(V_k + \lambda_k I)^{-1}\| \big( \mu_k \|F(x^k) - F(x^*) - F'(x^*; x^k - x^*)\| \\ & \quad + \mu_k \|V_k(x^k - x^*) - F'(x^*; x^k - x^*)\| + ((1 - \mu_k)\|V_k\| + |\lambda_k|) \|x^k - x^*\| \big) \\ & \le \dfrac{\gamma}{1 - \gamma(\epsilon + |\lambda_k|)} \big( 2\mu_k \epsilon + (1 - \mu_k)(\beta + \epsilon) + |\lambda_k| \big) \|x^k - x^*\| \\ & \le q \|x^k - x^*\|, \end{array}$$
where
$$0 < q = \frac{\gamma}{1 - \gamma(\epsilon + \hat{\lambda})} \big( 2\epsilon + (1 - \bar{\mu})(\beta + \epsilon) + \hat{\lambda} \big) < 1.$$
Hence the sequence $\{x^k\}$ defined by (11) converges linearly to $x^*$. By (14) and Lemma 2.2 in [24],
$$\|F(x^k) - F(x^*) - F'(x^*; x^k - x^*)\| = o(\|x^k - x^*\|)$$
and
$$\|V_k(x^k - x^*) - F'(x^*; x^k - x^*)\| = o(\|x^k - x^*\|).$$
Letting $\mu_k \to 1$ and $\lambda_k \to 0$ as $k \to \infty$, we have $\|x^{k+1} - x^*\| = o(\|x^k - x^*\|)$. Hence the sequence $\{x^k\}$ defined by (11) converges superlinearly to $x^*$ if $\mu_k \to 1$ and $\lambda_k \to 0$ as $k \to \infty$. □

3. Splitting technique

Splitting methods have been used to solve the nonsmooth equations [1,3,4,13,17,32-34]:
$$F(x) = \phi(x) + \psi(x) = 0, \quad x \in \mathbb{R}^n,$$
where $\phi$ is continuously differentiable, and $\psi$ is continuous, nondifferentiable and relatively small. Such nonsmooth equations arise from the numerical solution of nonsmooth partial differential equations. For example, consider the nonsmooth partial differential equation:

$$\begin{cases} -\triangle u + \psi(u) = \varphi(x, y) & \text{in a domain } \Omega \subset \mathbb{R}^2, \\ u = q(x, y) & \text{on the boundary } \partial\Omega, \end{cases}$$

where $\psi$ is not differentiable. Discretizing the partial differential equation by a finite difference method or a finite element method, we obtain a system of nonsmooth equations
$$F(x) = Ax + \psi(x) = 0, \quad x \in \mathbb{R}^n,$$
where $A$ is an $n \times n$ matrix (see [3]). If we know a splitting of $F$, methods (7), (8) and (9) can be used to solve $F(x) = 0$. However, the existence of such a splitting is not obvious in general. In this section we split $F(x) = \min(g(x), f(x))$ into smooth and nonsmooth parts. Using the splitting of $F$, we can solve (12) by (7), (8), (9) and other splitting methods.

Let $\epsilon > 0$ and
$$\hat{\phi}_i(x) = \frac{g_i(x) - f_i(x) + 2\epsilon}{4\epsilon}\, f_i(x) + \frac{f_i(x) - g_i(x) + 2\epsilon}{4\epsilon}\, g_i(x) - \frac{\epsilon}{4},$$
$$\hat{\psi}_i(x) = \frac{1}{4\epsilon}\, (|f_i(x) - g_i(x)| - \epsilon)^2,$$
$$\phi_i(x) = \begin{cases} F_i(x) & \text{if } |f_i(x) - g_i(x)| > \epsilon, \\ \hat{\phi}_i(x) & \text{if } |f_i(x) - g_i(x)| \le \epsilon, \end{cases}$$
$$\psi_i(x) = \begin{cases} 0 & \text{if } |f_i(x) - g_i(x)| > \epsilon, \\ \hat{\psi}_i(x) & \text{if } |f_i(x) - g_i(x)| \le \epsilon, \end{cases}$$
for $i = 1, 2, \ldots, n$.

Then it is easy to verify that
$$F(x) = \min(f(x), g(x)) = \phi(x) + \psi(x), \quad x \in \mathbb{R}^n, \tag{20}$$
where $\phi$ is continuously differentiable, $\psi$ is continuous and $\|\psi\|_\infty \le \epsilon/4$, i.e. $\psi$ is relatively small. In this construction, we use a fixed $\epsilon > 0$. A possible improvement is to let $\epsilon = \epsilon_k$ with $\epsilon_k \downarrow 0$, i.e., the splitting is changed at each step of the method. The convergence of such variants was investigated in [25,26].
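As an illustration, the splitting (20) can be evaluated componentwise as in the following hypothetical Python sketch, where `fx` and `gx` are the vectors $f(x)$ and $g(x)$:

```python
import numpy as np

def split_min(fx, gx, eps):
    """Splitting (20) of F = min(g, f): returns (phi, psi) with
    F = phi + psi, phi smooth and |psi_i| <= eps/4."""
    F = np.minimum(fx, gx)
    d = fx - gx
    smooth = np.abs(d) <= eps
    phi = np.where(
        smooth,
        (gx - fx + 2 * eps) / (4 * eps) * fx
        + (fx - gx + 2 * eps) / (4 * eps) * gx - eps / 4,
        F)
    psi = np.where(smooth, (np.abs(d) - eps) ** 2 / (4 * eps), 0.0)
    return phi, psi
```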

4. An approximation function and a Broyden-like method

The splitting function method is applied to the NCP in Section 3. However, some nonsmooth functions may not split into a smooth part and a relatively small nonsmooth part. In this section, we generalize the splitting technique by introducing approximation functions. Using approximation functions, we prove the convergence of (13).

Definition 3. We call a function $P : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ a point-based smooth approximation (PBSA) function of $F$ if
(i) for any $x \in \mathbb{R}^n$, $P_x(x) = F(x)$;
(ii) for each fixed $y \in \mathbb{R}^n$, there is a number $r_y > 0$ such that $P_y(\cdot) \equiv P(\cdot, y)$ is continuously differentiable in $S(y, r_y)$;
(iii) for each fixed $x \in \mathbb{R}^n$, there is a number $r_x > 0$ such that $P_{(\cdot)}(x) \equiv P(x, \cdot)$ is continuous in $S(x, r_x)$;
(iv) for each fixed $x \in \mathbb{R}^n$, there is a number $\hat{r}_x > 0$ such that $\frac{\partial}{\partial x} P(x, \cdot) \equiv P'_{(\cdot)}(x)$ is continuous in $S(x, \hat{r}_x)$. □

Proposition 1. Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable. Define $P$ by
$$P_y(x) \equiv F(x) \quad \text{for all } x, y \in \mathbb{R}^n.$$
Then $P$ is a PBSA function of $F$. □

Proposition 2. Let $F = \phi + \psi$, where $\phi$ is continuously differentiable and $\psi$ is continuous. Then $P$ defined by
$$P_y(x) = \phi(x) + \psi(y)$$

is a PBSA function of $F$.

Proof. Obviously $P_x(x) = \phi(x) + \psi(x) = F(x)$. For each fixed $y \in \mathbb{R}^n$, $P(\cdot, y) = \phi(\cdot) + \psi(y)$ is continuously differentiable in $\mathbb{R}^n$. For each fixed $x \in \mathbb{R}^n$,
$$P(x, \cdot) = \phi(x) + \psi(\cdot) \quad \text{and} \quad \frac{\partial}{\partial x} P(x, \cdot) = \phi'(x)$$
are continuous in $\mathbb{R}^n$. □

By Proposition 2, we may use (20) to induce a PBSA function of $F$ defined by (12). Alternatively, we may use a linear form as follows.

Proposition 3. Let $\epsilon$ be a small positive number and
$$\hat{p}_i(x, y) = \frac{g_i(y) - f_i(y) + \epsilon}{2\epsilon}\, f_i(x) + \frac{f_i(y) - g_i(y) + \epsilon}{2\epsilon}\, g_i(x), \quad i = 1, 2, \ldots, n.$$
Let the $i$-th component $p_i(x, y)$ of $P_y(x)$ be
$$p_i(x, y) = \begin{cases} F_i(x) & \text{if } |f_i(y) - g_i(y)| > \epsilon, \\ \hat{p}_i(x, y) - \hat{p}_i(y, y) + F_i(y) & \text{if } |f_i(y) - g_i(y)| \le \epsilon. \end{cases}$$
Then $P$ is a PBSA function of $F$ with $F$ defined by (12).

Proof. We consider $p_i$. Clearly, for any $x \in \mathbb{R}^n$, $p_i(x, x) = F_i(x)$. Let $y$ be a vector in $\mathbb{R}^n$. If $|f_i(y) - g_i(y)| > \epsilon$, then there is an $r_y > 0$ such that for any $x \in S(y, r_y)$, $|f_i(x) - g_i(x)| > \epsilon$ and
$$p_i(\cdot, y) = \begin{cases} f_i(\cdot) & \text{if } f_i(y) < g_i(y), \\ g_i(\cdot) & \text{if } g_i(y) < f_i(y). \end{cases}$$
If $|f_i(y) - g_i(y)| \le \epsilon$, then $p_i(\cdot, y) = \hat{p}_i(\cdot, y) - \hat{p}_i(y, y) + F_i(y)$ is continuously differentiable in $\mathbb{R}^n$. Hence condition (ii) of Definition 3 holds.

Let $x$ be a vector in $\mathbb{R}^n$. If $|f_i(x) - g_i(x)| > \epsilon$, then there exists a ball $S(x, r_x)$ such that for any $y \in S(x, r_x)$, $|f_i(y) - g_i(y)| > \epsilon$; similarly if $|f_i(x) - g_i(x)| < \epsilon$. Hence $p_i(x, \cdot) = F_i(x)$ is continuous in $S(x, r_x)$. Now assume $|f_i(x) - g_i(x)| = \epsilon$. Say $f_i(x) = g_i(x) + \epsilon$; then we have $F_i(x) = g_i(x)$ and
$$F_i(x) - \hat{p}_i(x, y) = -\frac{g_i(y) - f_i(y) + \epsilon}{2}.$$
Hence
$$p_i(x, \cdot) = \begin{cases} g_i(x) & \text{if } |f_i(\cdot) - g_i(\cdot)| > \epsilon, \\ g_i(x) + \dfrac{g_i(\cdot) - f_i(\cdot) + \epsilon}{2} - \hat{p}_i(\cdot, \cdot) + F_i(\cdot) & \text{if } |f_i(\cdot) - g_i(\cdot)| \le \epsilon, \end{cases}$$
is continuous in $\mathbb{R}^n$. Therefore condition (iii) of Definition 3 holds.

Now we show that condition (iv) of Definition 3 holds. Since
$$\frac{\partial}{\partial x} \hat{p}_i(x, y) = \frac{1}{2}\big(f_i'(x) + g_i'(x)\big) + \frac{1}{2\epsilon}\big(f_i'(x) - g_i'(x)\big)\big(g_i(y) - f_i(y)\big),$$
we have that $\frac{\partial}{\partial x} \hat{p}_i(x, \cdot)$ is continuous in $\mathbb{R}^n$ for any fixed $x \in \mathbb{R}^n$, and
$$F_i'(x) = \frac{\partial}{\partial x} \hat{p}_i(x, y) \quad \text{if } |f_i(y) - g_i(y)| = \epsilon.$$
Hence for each fixed $x \in \mathbb{R}^n$,
$$\frac{\partial}{\partial x} p_i(x, \cdot) = \begin{cases} F_i'(x) & \text{if } |f_i(\cdot) - g_i(\cdot)| > \epsilon, \\ \dfrac{\partial}{\partial x} \hat{p}_i(x, \cdot) & \text{if } |f_i(\cdot) - g_i(\cdot)| \le \epsilon, \end{cases}$$
is continuous in $\mathbb{R}^n$. □

Remark 1. We can construct different PBSA functions for a given function $F$. For example, the following two functions are also PBSA functions of $F$ defined by (12).
(1) Let
$$\hat{p}_i(x) = \frac{g_i(x) - f_i(x) + 2\epsilon}{4\epsilon}\, f_i(x) + \frac{f_i(x) - g_i(x) + 2\epsilon}{4\epsilon}\, g_i(x), \quad i = 1, 2, \ldots, n.$$
Define the $i$-th component $p_i(x, y)$ of $P_y(x)$ by
$$p_i(x, y) = \begin{cases} F_i(x) & \text{if } |f_i(y) - g_i(y)| > \epsilon, \\ \hat{p}_i(x) - \hat{p}_i(y) + F_i(y) & \text{if } |f_i(y) - g_i(y)| \le \epsilon. \end{cases}$$

(2) Let
$$\hat{p}_i(x, y) = \frac{g_i(y) - f_i(y) + \epsilon}{2\epsilon}\, f_i(x) + \frac{f_i(y) - g_i(y) + \epsilon}{2\epsilon}\, g_i(x) - \frac{\epsilon - |f_i(y) - g_i(y)|}{2\epsilon}\, |f_i(y) - g_i(y)|, \quad i = 1, 2, \ldots, n.$$
Define the $i$-th component $p_i(x, y)$ of $P_y(x)$ by
$$p_i(x, y) = \begin{cases} F_i(x) & \text{if } |f_i(y) - g_i(y)| > \epsilon, \\ \hat{p}_i(x, y) & \text{if } |f_i(y) - g_i(y)| \le \epsilon. \end{cases}$$

Remark 2. The idea of the PBSA function is a generalization of the idea of the splitting function. As Proposition 2 shows, a splitting of $F$ can easily be transformed into a PBSA function of $F$. However, the converse is not always true.

To prove convergence of the Broyden-like method (13), we need four lemmas from [6].

Lemma 1 [6]. Suppose that $s \in \mathbb{R}^n$, $s \ne 0$, and $I$ is the $n \times n$ identity matrix. Then
$$\left\| I - \frac{s s^T}{s^T s} \right\| = 1. \qquad \Box$$

Lemma 2 [6]. Let $s \in \mathbb{R}^n$ be nonzero, let $E \in \mathbb{R}^{n \times n}$, and let $\|\cdot\|_F$ denote the Frobenius norm. Then
$$\left\| E \Big( I - \frac{s s^T}{s^T s} \Big) \right\|_F = \left( \|E\|_F^2 - \Big( \frac{\|E s\|}{\|s\|} \Big)^2 \right)^{1/2} \le \|E\|_F - \frac{1}{2\|E\|_F} \Big( \frac{\|E s\|}{\|s\|} \Big)^2. \qquad \Box$$

Lemma 3 [6]. Let $x^k \in \mathbb{R}^n$, $k = 0, 1, \ldots$. If $\{x^k\}$ converges superlinearly to $x^* \in \mathbb{R}^n$, then
$$\lim_{k \to \infty} \frac{\|x^k - x^*\|}{\|x^{k+1} - x^k\|} = 1. \qquad \Box$$

Lemma 4 [6]. Assume that $P'_{x^*}(\cdot)$ is continuous at $x^*$ and $P'_{x^*}(x^*)$ is nonsingular. Then $\{x^k\}$ converges superlinearly to $x^*$ and $P_{x^*}(x^*) = 0$ if and only if
$$\lim_{k \to \infty} \frac{\|P_{x^*}(x^{k+1})\|}{\|x^{k+1} - x^k\|} = 0.$$

□

Theorem 2. Let $x^*$ be a solution of (1) and let $P$ be a PBSA function of $F$. Suppose that $P'_{x^*}(x^*)$ is nonsingular and that for any $x, y \in S(x^*, \hat{r})$,
$$\|P'_{x^*}(x^*)^{-1}(B_0 - P'_{x^*}(x^*))\| \le b,$$
$$\|P'_{x^*}(x^*)^{-1}(P'_{x^*}(x) - P'_{x^*}(x^*))\| \le K \|x - x^*\|,$$
$$\|P'_{x^*}(x^*)^{-1}(P_x(x) - P_{x^*}(x))\| \le e_1 \|x - x^*\|,$$
$$\|P'_{x^*}(x^*)^{-1}(P'_y(x) - P'_x(x))\| \le e_2 \|x - y\|,$$
$$3b + 2e_1 < 1 \quad \text{and} \quad r = \frac{2(1 - 3b - 2e_1)}{11K + 18e_2} \le \hat{r}.$$
Then for any $x^0 \in S(x^*, r)$, the sequence $\{x^k\}$ generated by the method (13) converges to $x^*$ and satisfies
$$\|x^{k+1} - x^*\| \le q \|x^k - x^*\|, \quad \text{where } q = 1/2.$$
Furthermore, $\{x^k\}$ converges superlinearly to $x^*$ if and only if
$$\|P'_{x^*}(x^*)^{-1}(P_k(x^k) - P_{x^*}(x^k))\| = o(\|x^k - x^*\|). \tag{21}$$

Proof. By induction on $k$, we prove that for any $k \ge 1$,
$$\|x^k - x^*\| \le q \|x^{k-1} - x^*\| \tag{22}$$
and
$$\|P'_{x^*}(x^*)^{-1}(B_k - P'_{x^*}(x^*))\| \le q_1, \tag{23}$$
where $q_1 = \frac{1}{3}(1 - Kr - 2e_1)$. Let $k = 1$. From the Banach perturbation lemma, $B_0^{-1}$ exists and
$$\|B_0^{-1} P'_{x^*}(x^*)\| \le \frac{1}{1 - b}.$$
From $x^0 \in S(x^*, r)$, we have
$$\begin{array}{rl} x^1 - x^* \!\!\!& = x^0 - B_0^{-1}(F(x^0) - F(x^*)) - x^* \\ & = B_0^{-1}\big(B_0(x^0 - x^*) - (P_0(x^0) - P_{x^*}(x^*))\big) \\ & = B_0^{-1}\big((B_0 - P'_{x^*}(x^*) + P'_{x^*}(x^*))(x^0 - x^*) + P_{x^*}(x^*) - P_{x^*}(x^0) + P_{x^*}(x^0) - P_0(x^0)\big) \\ & = B_0^{-1} P'_{x^*}(x^*)\, P'_{x^*}(x^*)^{-1} \Big( (B_0 - P'_{x^*}(x^*))(x^0 - x^*) \\ & \qquad + \displaystyle\int_0^1 \big(P'_{x^*}(x^*) - P'_{x^*}(x^* + t(x^0 - x^*))\big)(x^0 - x^*)\, dt + P_{x^*}(x^0) - P_0(x^0) \Big) \end{array}$$
and
$$\|x^1 - x^*\| \le \frac{1}{1 - b}\Big(b + \frac{K}{2}\|x^0 - x^*\| + e_1\Big)\|x^0 - x^*\| \le \frac{1}{1 - b}\Big(b + \frac{K}{2} r + e_1\Big)\|x^0 - x^*\| \le \frac{1}{1 - b}\Big(b + \frac{K}{2} \cdot \frac{2(1 - 3b - 2e_1)}{11K} + e_1\Big)\|x^0 - x^*\| \le q \|x^0 - x^*\|.$$
Hence $x^1 \in S(x^*, r)$. Let $z = x^1 + t(x^0 - x^1)$. From Lemma 1 and the relation
$$B_1 - P'_{x^*}(x^*) = B_0 - P'_{x^*}(x^*) + \frac{(t^0 - B_0 s^0)(s^0)^T}{(s^0)^T s^0} = (B_0 - P'_{x^*}(x^*))\Big(I - \frac{s^0 (s^0)^T}{(s^0)^T s^0}\Big) + \frac{(t^0 - P'_{x^*}(x^*) s^0)(s^0)^T}{(s^0)^T s^0},$$
we have
$$\begin{array}{rl} \|P'_{x^*}(x^*)^{-1}(B_1 - P'_{x^*}(x^*))\| \!\!\!& \le \|P'_{x^*}(x^*)^{-1}(B_0 - P'_{x^*}(x^*))\| \\ & \quad + \Big\| P'_{x^*}(x^*)^{-1} \displaystyle\int_0^1 \big( P'_{x^*}(z) - P'_{x^*}(x^*) + P'_z(z) - P'_{x^*}(z) + P'_0(z) - P'_z(z) \big)\, dt \Big\| \\ & \le b + \dfrac{K}{2}(\|x^1 - x^*\| + \|x^0 - x^*\|) + \dfrac{e_2}{2}(\|x^1 - x^*\| + \|x^0 - x^*\|) + \dfrac{e_2}{2}\|x^1 - x^0\| \\ & \le b + (1 + q)\Big(\dfrac{K}{2} + e_2\Big)\|x^0 - x^*\| \\ & \le b + \dfrac{3}{2}\Big(\dfrac{K}{2} + e_2\Big) r \le q_1 < 1, \end{array}$$
where we use $(13K + 18e_2) r \le 4(1 - 3b - 2e_1)$ to derive the last inequality.

Now, we suppose that (22) and (23) hold for all $i$ ($0 \le i \le k$) and prove them for $k + 1$. By the same technique as above, we obtain
$$\|x^{k+1} - x^*\| \le \frac{1}{1 - q_1}\Big(q_1 + \frac{K}{2} r + e_1\Big)\|x^k - x^*\| \le q \|x^k - x^*\|$$
and
$$\begin{array}{rl} \|P'_{x^*}(x^*)^{-1}(B_{k+1} - P'_{x^*}(x^*))\| \!\!\!& \le \|P'_{x^*}(x^*)^{-1}(B_k - P'_{x^*}(x^*))\| + (1 + q)\Big(\dfrac{K}{2} + e_2\Big)\|x^k - x^*\| \\ & \le b + (1 + q)\Big(\dfrac{K}{2} + e_2\Big) \displaystyle\sum_{i=0}^{k-1} q^i \|x^0 - x^*\| \\ & \le b + \dfrac{3}{2}\Big(\dfrac{K}{2} + e_2\Big) r/(1 - q) \le q_1. \end{array}$$
Hence (22) and (23) hold for all $k$.

Now we prove the superlinear convergence. Let
$$E_k = P'_{x^*}(x^*)^{-1}(B_k - P'_{x^*}(x^*)).$$
From Lemma 2, we have
$$\|E_{k+1}\|_F \le \Big\| E_k \Big(I - \frac{s^k (s^k)^T}{(s^k)^T s^k}\Big) \Big\|_F + \frac{3}{2}\Big(\frac{K}{2} + e_2\Big)\|x^k - x^*\| \le \|E_k\|_F - \frac{1}{2\|E_k\|_F}\Big(\frac{\|E_k s^k\|}{\|s^k\|}\Big)^2 + \frac{3}{2}\Big(\frac{K}{2} + e_2\Big)\|x^k - x^*\|.$$
Since
$$\sum_{k=1}^{\infty} \frac{3}{2}\Big(\frac{K}{2} + e_2\Big) q^k \|x^0 - x^*\| \le 3\Big(\frac{K}{2} + e_2\Big) r,$$
we obtain
$$\sum_{k=1}^{\infty} \Big(\frac{\|E_k s^k\|}{\|s^k\|}\Big)^2 < \infty.$$
This implies
$$\lim_{k \to \infty} \frac{\|E_k s^k\|}{\|s^k\|} = 0.$$
Furthermore, from
$$E_k s^k = P'_{x^*}(x^*)^{-1}\big({-F(x^k)} - P'_{x^*}(x^*)(x^{k+1} - x^k)\big),$$
we have the equality
$$E_k s^k = P'_{x^*}(x^*)^{-1}\big(P_{x^*}(x^{k+1}) - P_{x^*}(x^k) - P'_{x^*}(x^*)(x^{k+1} - x^k) - P_k(x^k) + P_{x^*}(x^k) - P_{x^*}(x^{k+1})\big). \tag{24}$$
This implies that
$$\lim_{k \to \infty} \frac{\|P'_{x^*}(x^*)^{-1}(P_k(x^k) - P_{x^*}(x^k))\|}{\|x^{k+1} - x^k\|} = 0 \tag{25}$$
if and only if
$$\lim_{k \to \infty} \frac{\|P'_{x^*}(x^*)^{-1} P_{x^*}(x^{k+1})\|}{\|x^{k+1} - x^k\|} = 0. \tag{26}$$
Suppose that (21) holds. Since
$$x^{k+1} - x^k = B_k^{-1}\big(P_{x^*}(x^*) - P_{x^*}(x^k) + P_{x^*}(x^k) - P_k(x^k)\big) = -B_k^{-1}\Big(\Big( \int_0^1 P'_{x^*}(x^* + t(x^k - x^*))\, dt - P'_{x^*}(x^*) + P'_{x^*}(x^*) - B_k + B_k \Big)(x^k - x^*) + P_k(x^k) - P_{x^*}(x^k)\Big),$$
we have
$$\begin{array}{rl} \|x^{k+1} - x^k\| \!\!\!& \ge \big(1 - \|B_k^{-1} P'_{x^*}(x^*)\|\, \|P'_{x^*}(x^*)^{-1}(P'_{x^*}(x^*) - B_k)\|\big)\|x^k - x^*\| \\ & \quad - \dfrac{K}{2}\|x^k - x^*\|^2 - \|P'_{x^*}(x^*)^{-1}(P_k(x^k) - P_{x^*}(x^k))\| \\ & \ge \dfrac{1}{2}\|x^k - x^*\| - o(\|x^k - x^*\|) = O(\|x^k - x^*\|), \end{array}$$
where we used the inequality $q_1 < 1/3$. Hence (25) and (26) hold. By Lemma 4 and $P_{x^*}(x^*) = F(x^*)$, we obtain that $\{x^k\}$ converges superlinearly to $x^*$.

Conversely, assume that $\{x^k\}$ converges superlinearly to $x^*$. Then from Lemma 4 we obtain (26). This implies that (25) holds, i.e.,
$$\|P'_{x^*}(x^*)^{-1}(P_k(x^k) - P_{x^*}(x^k))\| = o(\|x^{k+1} - x^k\|).$$
From Lemma 3, we conclude that (21) holds. □

5. Numerical experiments

In this section, we give computational results for four numerical examples. Example 1 is the Kojima–Shindo four-variable problem. We first solve this problem by methods (3), (6), (9), (10), (11) and (13); next we solve it by these methods with a line search. Example 2 is the Hansen–Koopmans problem. Example 3 is the Nash–Cournot production problem. Example 4 is the Walrasian production-price problem. We solve these three problems by the method (13) with a line search. Let
$$\theta(x) = \tfrac{1}{2} F(x)^T F(x).$$

Damped Algorithm. Let $\sigma, \tau \in (0, 1)$ and an initial vector $x^0 \in \mathbb{R}^n$ be given. For $k \ge 0$:

1. Solve for $d^k$:
$$V_k d = -F(x^k), \quad \text{or} \quad F(x^k) + G(x^k, d) = 0, \quad \text{or} \quad (V_k + \lambda_k I) d = -F(x^k), \quad \text{or} \quad B_k d = -F(x^k),$$
with $B_k$ defined in (9) or (13).

2. Let $m_k$ be the smallest nonnegative integer $m$ such that
$$\theta(x^k + \tau^m d^k) - \theta(x^k) \le -\sigma \tau^{2m} \theta(x^k).$$
Set $x^{k+1} = x^k + \tau^{m_k} d^k$.

To solve the equation (12) by the iteration function method (6), we need a particular definition of the iteration function $G(\cdot, \cdot)$. The function $G(\cdot, \cdot)$ for (12) was defined in [10] as:

$$G_i(x, d) = \begin{cases} \nabla g_i(x)^T d & \text{if } g_i(x) < f_i(x), \\ \nabla f_i(x)^T d & \text{if } g_i(x) > f_i(x), \\ \min(\nabla g_i(x)^T d,\ \nabla f_i(x)^T d) & \text{otherwise}. \end{cases}$$

To solve (12) by the generalized Jacobian based Newton method (3), we determine an element $V_x \in \partial_B F(x)$ by the following method (see [24]). Let
$$I_0 = \{ i : 1 \le i \le n,\ g_i(x) = f_i(x),\ \nabla g_i(x) \ne \nabla f_i(x) \}.$$
$F$ is differentiable at $x$ if and only if $I_0 = \emptyset$. If $I_0 = \emptyset$, then $V_x = \nabla F(x)$ is determined. Otherwise, let $I_0 = \{i_1, i_2, \ldots, i_m\}$ ($m \ge 1$) and let
$$Q = (q_{i,k}) = \begin{pmatrix} \nabla f_{i_1}(x) - \nabla g_{i_1}(x) \\ \vdots \\ \nabla f_{i_m}(x) - \nabla g_{i_m}(x) \end{pmatrix} \in \mathbb{R}^{m \times n}.$$
Let $I_k = \{ i : 1 \le i \le m,\ q_{i,k} \ne 0 \}$, $k = 1, 2, \ldots, n$. Then $\{1, 2, \ldots, m\} = \bigcup_{k=1}^n I_k$. Now we pick a vector $z$ such that
$$(Qz)_i \ne 0, \quad i = 1, 2, \ldots, m.$$
Let $z^1 = e_1 = (1, 0, 0, \ldots, 0)^T \in \mathbb{R}^n$. For $k \ge 1$: if $\{1, \ldots, m\} = \bigcup_{i=1}^k I_i$, then let $z = z^k$; otherwise consider $I_{k+1}$: if $I_{k+1} \subseteq \bigcup_{i=1}^k I_i$, then let $z^{k+1} = z^k$; otherwise let $z^{k+1} = z^k + \delta_k \|Q\|^{-1} e_{k+1}$ for a suitably small $\delta_k > 0$. Using such a $z$, we define row $(V_x)_i$ of $V_x$ by:
$(V_x)_i = \nabla f_i(x)$ if $f_i(x) < g_i(x)$, or if $f_i(x) = g_i(x)$ and $\nabla f_i(x) = \nabla g_i(x)$, or if $f_i(x) = g_i(x)$ and $\nabla f_i(x)^T z < \nabla g_i(x)^T z$;
$(V_x)_i = \nabla g_i(x)$ otherwise.
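The following hypothetical Python sketch combines this selection with the Damped Algorithm for (12); the fixed probe vector `z` is a crude stand-in for the construction with $Q$ above, and the default parameter values follow the note to Table 2 below.

```python
import numpy as np

def element_of_dB_min(x, f, g, Jf, Jg):
    """Pick one V in partial_B F(x) for F(x) = min(g(x), f(x)):
    rows with f_i < g_i take f's Jacobian row, rows with g_i < f_i
    take g's row; ties are broken with a fixed probe vector z."""
    fx, gx, Jfx, Jgx = f(x), g(x), Jf(x), Jg(x)
    n = len(x)
    z = np.full(n, 1e-8); z[0] = 1.0   # crude stand-in for z built from Q
    V = np.empty((n, n))
    for i in range(n):
        if fx[i] < gx[i] or (fx[i] == gx[i] and Jfx[i] @ z < Jgx[i] @ z):
            V[i] = Jfx[i]
        else:
            V[i] = Jgx[i]
    return V

def damped_newton(F, element_of_dB, x, sigma=0.7, tau=0.8,
                  tol=1e-6, max_iter=500):
    """Damped Algorithm: Newton direction from one element of
    partial_B F, then the Armijo-type test on theta = 0.5*||F||^2."""
    theta = lambda y: 0.5 * F(y) @ F(y)
    for _ in range(max_iter):
        d = np.linalg.solve(element_of_dB(x), -F(x))
        if np.linalg.norm(d) <= tol:
            break
        m = 0
        while (theta(x + tau**m * d) - theta(x)
               > -sigma * tau**(2 * m) * theta(x)) and m < 50:
            m += 1
        x = x + tau**m * d
    return x
```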

Example 1. We consider the following degenerate nonlinear complementarity problem [15]:
$$x \ge 0, \quad f(x) \ge 0, \quad x^T f(x) = 0, \quad x \in \mathbb{R}^4,$$
where $f : \mathbb{R}^4 \to \mathbb{R}^4$ is given by
$$f(x) = \begin{pmatrix} 3x_1^2 + 2x_1 x_2 + 2x_2^2 + x_3 + 3x_4 - 6 \\ 2x_1^2 + x_1 + x_2^2 + 10x_3 + 2x_4 - 2 \\ 3x_1^2 + x_1 x_2 + 2x_2^2 + 2x_3 + 9x_4 - 9 \\ x_1^2 + 3x_2^2 + 2x_3 + 3x_4 - 3 \end{pmatrix}.$$

This problem has two solutions,
$$x^* = (1, 0, 3, 0) \quad \text{and} \quad \bar{x} = (\sqrt{6}/2, 0, 0, 0.5) \approx (1.224745, 0, 0, 0.5).$$
$F(x)$ is differentiable at the first solution but nondifferentiable at the second one. This problem is solved by using the method (3), which is called "V"; the method (6), called "G"; the method (9), called "Split"; the method (10), called "B"; the method (11), called "Param"; and the method (13), called "Appro". These methods with a line search are called "RV", "RG", "RSplit", "RB", "RParam" and "RAppro", respectively.

First, we notice that "V", "RV", "G" and "RG" fail in several instances, due to the singularity of $V_x$ or the unsolvability of the equation in (6). For the initial value $\hat{x} = (0, 0, 1, 0)$,
$$V_{\hat{x}} = F'(\hat{x}) = \begin{pmatrix} 0 & 0 & 1 & 3 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 9 \\ 0 & 0 & 2 & 3 \end{pmatrix}$$
and $G(\hat{x}, d) = V_{\hat{x}} d = F'(\hat{x}) d$. There is no solution of $F(\hat{x}) + G(\hat{x}, d) = 0$. For the initial value $x = (0, 0, 0, 1)$,
$$V_x = \begin{pmatrix} 0 & 0 & 1 & 3 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 2 & 3 \end{pmatrix}$$
and there is no solution $d$ of the equation
$$F(x) + G(x, d) = \begin{pmatrix} -3 + d_3 + 3d_4 \\ 0 + \min(d_2,\ d_1 + 10 d_3 + 2 d_4) \\ 0 + \min(d_3,\ 2 d_3 + 9 d_4) \\ 0 + 2 d_3 + 3 d_4 \end{pmatrix} = 0.$$

□

The computational results for Example 1 are shown in Tables 1 and 2.

Table 1: The iteration number $k$ and the error $\|x^k - x^*\|_\infty$ (or $\|x^k - \bar{x}\|_\infty$).

Initial data   V              G              Param          B               Split           Appro
(1,0,0,0)      4 (2×10⁻⁷)     4 (2×10⁻⁷)     15 (3×10⁻⁷)    93 (57×10⁻⁷)    39 (13×10⁻⁷)    31 (4×10⁻⁷)
(0,1,0,0)      failed         failed         failed         27 (8×10⁻⁷)     29 (5×10⁻⁷)     24 (3×10⁻⁷)
(0,0,1,0)      failed         failed         16 (7×10⁻⁷)    29 (3×10⁻⁷)     58 (7×10⁻⁷)     24 (13×10⁻⁷)
(0,0,0,1)      failed         failed         16 (5×10⁻⁷)    36 (24×10⁻⁷)    31 (12×10⁻⁷)    98 (16×10⁻⁷)
(0,0,0,0)      failed         failed         19 (6×10⁻⁷)    43 (4×10⁻⁷)     99 (4×10⁻⁷)     43† (16×10⁻⁷)
(1,1,1,1)      failed         failed         17 (6×10⁻⁷)    32 (5×10⁻⁷)     28 (1×10⁻⁷)     51† (56×10⁻⁷)
(1,0,1,0)      2 (0.0)        2 (0.0)        16 (5×10⁻⁷)    440 (12×10⁻⁷)   140 (23×10⁻⁷)   29 (6×10⁻⁷)
(1,0,0,1)      5 (2×10⁻⁷)     5 (2×10⁻⁷)     14 (7×10⁻⁷)    33 (3×10⁻⁷)     29 (2×10⁻⁷)     53 (1×10⁻⁷)

Notes: 1. The splitting of $F$ is defined by (20). 2. The PBSA function $P$ is defined in Proposition 3. 3. $\mu = 0.625$, $\lambda = 1.0$ and $\epsilon = 0.625$ (or $\epsilon = 0.0625$ at †). 4. Stopping criterion: $\|d^k\| \le 10^{-6}$.

Table 2: The iteration number $k$ and $\|x^k - x^*\|_\infty$ (or $\|x^k - \bar{x}\|_\infty$) for the Damped Algorithm.

Initial data   RV              RG              RParam          RB              RSplit          RAppro
(1,0,0,0)      20 (8×10⁻⁷)     25 (11×10⁻⁷)    20 (5×10⁻⁷)     33 (3×10⁻⁷)     33 (3×10⁻⁷)     33 (3×10⁻⁷)
(0,1,0,0)      failed          failed          failed          35 (5×10⁻⁷)     35 (5×10⁻⁷)     349† (45×10⁻⁷)
(0,0,1,0)      failed          failed          23 (11×10⁻⁷)    23 (3×10⁻⁷)     23 (3×10⁻⁷)     23 (3×10⁻⁷)
(0,0,0,1)      failed          failed          21 (5×10⁻⁷)     43 (14×10⁻⁷)    43 (14×10⁻⁷)    40 (3×10⁻⁷)
(0,0,0,0)      failed          failed          32 (5×10⁻⁷)     28 (2×10⁻⁷)     28 (2×10⁻⁷)     28 (2×10⁻⁷)
(1,1,1,1)      23 (52×10⁻⁷)    23 (52×10⁻⁷)    23 (52×10⁻⁷)    26 (12×10⁻⁷)    26 (12×10⁻⁷)    26 (12×10⁻⁷)
(1,0,1,0)      21 (12×10⁻⁷)    21 (12×10⁻⁷)    21 (12×10⁻⁷)    20 (5×10⁻⁷)     20 (5×10⁻⁷)     27 (2×10⁻⁷)
(1,0,0,1)      18 (7×10⁻⁷)     18 (7×10⁻⁷)     19 (2×10⁻⁷)     74 (9×10⁻⁷)     74 (9×10⁻⁷)     109† (25×10⁻⁷)

Notes: 1. The splitting of $F$ is defined by (20). 2. The PBSA function $P$ is defined in Proposition 3. 3. $\sigma = 0.7$, $\tau = 0.8$, $\mu = 1.0$, $\lambda = 1.0$, $\epsilon = 0.625$ in RSplit, and $\epsilon = 0.0625$ (or $\epsilon = 0.00625$ at †) in RAppro. 4. Stopping criterion: $\|d^k\| \le 10^{-6}$.

Remarks on Tables 1 and 2.
1. As the results in Table 1 show, we successfully find a solution by the Broyden method (10), the Broyden-like method with the PBSA function (13) and the Broyden-like method with the splitting (9). However, the solution is the first solution $x^*$ in all cases. The second solution $\bar{x}$ is closer to the initial point (1,0,0,1) than $x^*$, but the three methods converge to $x^*$ from (1,0,0,1). We guess that this situation arises because $F$ is not differentiable at $\bar{x}$. However, as the results in Table 2 show, we can find the second solution by the three methods with a line search.
2. We chose $\mu_k$, $\lambda_k$ and $\epsilon$ as constants for comparison. In applications of these methods, one may choose different $\mu_k$, $\lambda_k$ and $\epsilon_k$ at each step.

Example 2. In [11], Hansen and Koopmans considered an economy with three consumption goods, two capital goods, and two resources other than capital goods (e.g. skilled and unskilled labor). This problem can be formulated as an NCP [22] with the functions $g(x, y, z) = (x, y, z)$ and $f : \mathbb{R}^n_+ \times \mathbb{R}^m \times \mathbb{R}^l \to \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$ defined by
$$f(x, y, z) = \begin{pmatrix} 0 \\ 0 \\ w \end{pmatrix} + \begin{pmatrix} 0 & \alpha A^T - B^T & C^T \\ B - \alpha A & 0 & 0 \\ -C & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} - \begin{pmatrix} \nabla v(x) \\ 0 \\ 0 \end{pmatrix},$$
where
$$A = \begin{pmatrix} 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 3 & 3 & 2 & 2 & 1 & 1 & 1 & 0.5 & 1 & 0.5 \end{pmatrix},$$
$B$ and $C$ are the $2 \times 10$ data matrices given with the problem in [11,22],
$$w = (0.8, 0.8), \quad \alpha = 0.7,$$
and
$$v(x) = (x_1 + 2.5 x_2)^{0.2} (2.5 x_3 + x_4)^{0.2} (2 x_5 + 3 x_6)^{0.2}.$$

We solve this problem by the method (13) with a line search. The computational results are shown in Figure 1, with the starting point $(x, y, z) = (0.5, \ldots, 0.5, 0, 0, 0, 0)$ suggested in [22].

Example 3. A Nash–Cournot equilibrium problem given in [12] was formulated in [22] as an NCP with the functions $g(x) = x$ and $f : \mathbb{R}^n_+ \to \mathbb{R}^n$ defined by
$$f_i(x) = c_i'(x_i) - p\Big(\sum_{j=1}^n x_j\Big) - x_i\, p'\Big(\sum_{j=1}^n x_j\Big), \quad i = 1, 2, \ldots, n,$$
where
$$c_i(x_i) = \alpha_i x_i + \frac{\beta_i}{\beta_i + 1}\, L_i^{-1/\beta_i}\, x_i^{(\beta_i + 1)/\beta_i}$$
and
$$p(Q) = 5000^{1/\gamma}\, Q^{-1/\gamma} \quad \text{with } Q = \sum_{i=1}^n x_i \text{ and } \gamma = 1.2.$$
The data are:

 i       1      2      3      4      5      6      7      8      9      10
 α_i     5.     3.     8.     5.     1.     3.     7.     4.     6.     3.
 L_i     10.    10.    10.    10.    10.    10.    10.    10.    10.    10.
 β_i     1.20   1.00   0.90   0.60   1.50   1.00   0.70   1.10   0.95   0.75

We solve this problem by the method (13) with a line search. The computational results with the initial point $x_i^0 = 1.1$, $i = 1, 2, \ldots, 10$, are shown in Figure 1.
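For concreteness, a small Python sketch of this $f$ with the data above (illustrative only, under the symbol names used here) could be:

```python
import numpy as np

# Nash-Cournot data from the table above
alpha = np.array([5., 3., 8., 5., 1., 3., 7., 4., 6., 3.])
L     = np.full(10, 10.0)
beta  = np.array([1.2, 1., .9, .6, 1.5, 1., .7, 1.1, .95, .75])
gamma = 1.2

def f(x):
    """f_i(x) = c_i'(x_i) - p(Q) - x_i p'(Q), with Q = sum(x)."""
    Q  = x.sum()
    p  = 5000.0 ** (1 / gamma) * Q ** (-1 / gamma)   # inverse demand p(Q)
    dp = -(1 / gamma) * p / Q                        # p'(Q)
    dc = alpha + L ** (-1 / beta) * x ** (1 / beta)  # c_i'(x_i)
    return dc - p - x * dp
```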


Example 4. "Perhaps the most interesting application - from an economic point of view - will be to the general Walrasian model" [30]. This example considers the Walrasian production-price problem [30]. Let $\xi$ be the demand function defined by
$$\xi_j(\pi) = \sum_{i=1}^{l} \frac{a_{ij}\, \lambda_i(\pi)}{\pi_j^{b_i}}, \quad j = 1, 2, \ldots, m,$$
where $\pi \in \mathbb{R}^m$ is the price vector; for $i = 1, \ldots, l$, $b_i$ is the $i$-th consumer's elasticity of substitution, $a_{i,j}$ measures the $i$-th consumer's intensity of demand for commodity $j$, and
$$\lambda_i(\pi) = \Big(\sum_{k=1}^{m} \pi_k \omega_{i,k}\Big) \Big/ \Big(\sum_{k=1}^{m} a_{i,k} \pi_k^{1 - b_i}\Big)$$
is a function of the price vector selected so that the budget constraint is satisfied for each individual. Let $B \in \mathbb{R}^{m \times n}$ describe the technology and $w \in \mathbb{R}^m$ describe the total resource endowment of the economy prior to production. Let $y$ be the production plan (activity levels). Then this model can be formulated as an NCP [22] defined by the functions $g(\pi, y) = (\pi, y) \in \mathbb{R}^m_{++} \times \mathbb{R}^n$ and $f : \mathbb{R}^m_{++} \times \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n$, where
$$f(\pi, y) = \begin{pmatrix} w - \xi(\pi) + B y \\ -B^T \pi \end{pmatrix}.$$
Using the method (13) with a line search, we solve a problem with $(m, n, l) = (6, 8, 5)$, which is given in Chapter 5 of [30]. We choose different starting points around an approximate solution
$$\tilde{x} = (0.22032, 0.25107, 0.16102, 0.05494, 0.10608, 0.20658, 0.4635, 0.0, 3.9392, 0.0060, 0.0, 0.0, 0.4383, 0.0).$$
Let $\hat{e} = (1, 1, 1, 1, 1, 1, 0, \ldots, 0) \in \mathbb{R}^{14}$ and $e = (0, 0, 0, 0, 0, 0, 1, \ldots, 1) \in \mathbb{R}^{14}$. The results are summarized in Table 3.


Table 3: Initial data $x^0 = (\pi^0, y^0) = \tilde{x} + c_0 \hat{e} + c_1 e$; $k$ is the iteration number at which $\|F(x^k)\| \le 10^{-6}$.

 c_0    2.0×10⁻¹   2.0×10⁻¹   2.0×10⁻⁶   2.0×10⁰   1.0×10⁰
 c_1    2.0×10⁻¹   0.0        2.0×10⁻¹   2.0×10⁰   2.0×10⁻²
 k      52         9          111        262       33

Acknowledgements. We are grateful to two referees and Dr. R. Womersley for their comments.

REFERENCES

1. X. Chen, On the convergence of Broyden-like methods for nonlinear equations with nondifferentiable terms, Ann. Inst. Statist. Math. 42 (1990) 387-401.
2. X. Chen, M.Z. Nashed and L. Qi, Convergence of Newton's method for singular smooth and nonsmooth equations using adaptive outer inverses, Applied Mathematics Preprint AM93/4, School of Mathematics, The University of New South Wales (Sydney, Australia, 1993).
3. X. Chen and T. Yamamoto, On the convergence of some quasi-Newton methods for nonlinear equations with nondifferentiable operators, Computing 48 (1992) 87-94.
4. X. Chen and T. Yamamoto, Newton-like methods for solving underdetermined nonlinear equations, to appear in: J. of Comp. Appl. Math.
5. F.H. Clarke, Optimization and Nonsmooth Analysis, John Wiley and Sons, New York, 1983.
6. J.E. Dennis, Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Inc., 1983.
7. M. Ferris and S. Lucidi, Globally convergent methods for nonlinear equations, Computer Sciences Technical Report #1030, Computer Sciences Department, University of Wisconsin (Madison, USA, 1991).
8. M. Fukushima, Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems, Math. Programming 53 (1992) 99-110.
9. S.A. Gabriel and J.S. Pang, A trust region method for constrained nonsmooth equations, to appear in: W.W. Hager, D.W. Hearn and P.M. Pardalos, eds., Large Scale Optimization: State of the Art, Kluwer Academic Publishers B.V.
10. S.P. Han, J.S. Pang and N. Rangaraj, Globally convergent Newton methods for nonsmooth equations, Math. Oper. Res. 17 (1992) 586-607.
11. T. Hansen and T.C. Koopmans, On the definition and computation of capital stock invariant under optimization, J. Economic Theory 5 (1972) 487-523.
12. P.T. Harker, Accelerating the convergence of the diagonalization and projection algorithms for finite-dimensional variational inequalities, Math. Programming 41 (1988) 25-59.
13. M. Heinkenschloss, C.T. Kelley and H.T. Tran, Fast algorithms for nonsmooth compact fixed point problems, SIAM J. Numer. Anal. 29 (1992) 1769-1792.
14. C.M. Ip and J. Kyparisis, Local convergence of quasi-Newton methods for B-differentiable equations, Math. Programming 56 (1992) 71-89.
15. M. Kojima and S. Shindo, Extensions of Newton and quasi-Newton methods to systems of PC¹ equations, J. Oper. Res. Soc. of Japan 29 (1986) 352-374.
16. B. Kummer, Newton's method for non-differentiable functions, in: J. Guddat, B. Bank, H. Hollatz, P. Kall, D. Klatte, B. Kummer, K. Lommatzsch, L. Tammer, M. Vlach and K. Zimmerman, eds., Advances in Mathematical Optimization, Akademie-Verlag, Berlin (1988) 114-125.
17. J.M. Martínez and M.C. Zambaldi, Least change update methods for nonlinear systems with nondifferentiable terms, Numer. Funct. Anal. and Optimiz. 14 (1993) 405-415.
18. M.Z. Nashed and X. Chen, Convergence of Newton-like methods for singular operator equations using outer inverses, to appear in: Numer. Math. 66 (1993).
19. J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
20. J.S. Pang, Newton's method for B-differentiable equations, Math. of Oper. Res. 15 (1990) 311-341.
21. J.S. Pang, A B-differentiable equation-based, globally and locally quadratically convergent algorithm for nonlinear programs, complementarity and variational inequality problems, Math. Programming 51 (1991) 101-131.
22. J.S. Pang and S.A. Gabriel, NE/SQP: A robust algorithm for the nonlinear complementarity problem, Math. Programming 60 (1993) 295-337.
23. J.S. Pang and L. Qi, Nonsmooth equations: motivation and algorithms, SIAM J. Optimization 3 (1993) 443-465.
24. L. Qi, Convergence analysis of some algorithms for solving nonsmooth equations, Math. Oper. Res. 18 (1993) 227-244.
25. L. Qi, Trust region algorithms for solving nonsmooth equations, Applied Mathematics Preprint AM92/20, The University of New South Wales (Sydney, Australia, 1992).
26. L. Qi and X. Chen, A globally convergent successive approximation method for nonsmooth equations, Applied Mathematics Preprint AM92/22, The University of New South Wales (Sydney, Australia, 1992).
27. L. Qi and J. Sun, A nonsmooth version of Newton's method, Math. Programming 58 (1993) 353-367.
28. D. Ralph, Global convergence of damped Newton's method for nonsmooth equations via the path search, to appear in: Math. Oper. Res.
29. S.M. Robinson, Newton's method for a class of nonsmooth functions, Industrial Engineering Working Paper, University of Wisconsin (Madison, USA, 1988).
30. H. Scarf, The Computation of Economic Equilibria, Yale University Press (1973).
31. A. Shapiro, On concepts of directional differentiability, J. Optimiz. Theory Appl. 66 (1990) 477-487.
32. T. Yamamoto, A note on a posteriori error bound of Zabrejko and Nguen for Zincenko's iteration, Numer. Funct. Anal. and Optimiz. 9 (1987) 987-994.
33. T. Yamamoto and X. Chen, Ball-convergence theorems and error estimates for certain iterative methods for nonlinear equations, Japan J. Appl. Math. 7 (1990) 131-143.
34. A.I. Zincenko, Some approximate methods of solving equations with nondifferentiable operators (Ukrainian), Dopovidi Akad. Nauk. Ukrain. RSR (1963) 156-161.
