ON GENERALIZED NEWTON ALGORITHMS: QUADRATIC CONVERGENCE, PATH-FOLLOWING AND ERROR ANALYSIS

GREGORIO MALAJOVICH

November 10, 1993

Abstract. Newton iteration is known (under some precise conditions) to converge quadratically to zeros of non-degenerate systems of polynomials. This and other properties may be used to obtain theorems on the global complexity of solving systems of polynomial equations (see Shub and Smale [6]), using a model of computability over the reals. However, it is neither practical nor desirable to compute Newton iteration exactly. In this paper, approximate Newton iteration is investigated for several generalizations of the Newton operator. Quadratic convergence theorems and a robustness theorem are extended to approximate Newton iteration, generalizing some of the results in [6]. The results here can be used to prove complexity theorems on path-following algorithms for solving systems of polynomial equations, using a model of computation over the integers (Malajovich [3]).

1. Introduction

The classical Newton operator for approximating a zero of a system of $n$ polynomial equations $f = (f_1, \dots, f_n)$ in $n$ variables is given by the formula:

\[ N^{\mathrm{aff}}(f) : \mathbb{C}^n \to \mathbb{C}^n, \qquad x \mapsto x - Df(x)^{-1} f(x) \]

It is well known that once a point $x_0$ is close enough to a non-degenerate zero $\zeta$ of $f$ (i.e., a zero $\zeta$ of $f$ with $\operatorname{rank} Df(\zeta) = n$), the sequence $(x_i)$ defined by $x_{i+1} = N^{\mathrm{aff}}(f)(x_i)$ converges quadratically to $\zeta$. This means that the error $\|x_i - \zeta\|_2$ is, roughly speaking, at most squared at each iteration. Therefore, given such an $x_0$, in order to obtain an approximation of $\zeta$ with precision $\epsilon$, it suffices to perform $O(\log(-\log \epsilon))$ iterations of the Newton operator $N^{\mathrm{aff}}$.

A rigorous analysis of Newton iteration appears in Smale [10]. Criteria based solely on the knowledge of $f$ and $x$ are given there in order to guarantee quadratic convergence.

Key words and phrases. Newton iteration, path-following, alpha-theory, complexity.

Sponsored by CNPq (Brasil). This paper was finished while the author was visiting CRM (Institut d'Estudis Catalans).
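The squaring of the error described in the Introduction can be observed in the simplest case $n = 1$. The following is a minimal sketch only: the test polynomial $f(x) = x^2 - 2$, the starting point 1.5 and the helper names `newton_step` and `newton_errors` are illustrative assumptions, not taken from the paper.

```python
def newton_step(f, df, x):
    """One exact Newton step x - f(x)/f'(x), i.e. the case n = 1."""
    return x - f(x) / df(x)


def newton_errors(f, df, x0, zeta, steps):
    """Return the list of errors |x_i - zeta| for i = 0, ..., steps."""
    errors = [abs(x0 - zeta)]
    x = x0
    for _ in range(steps):
        x = newton_step(f, df, x)
        errors.append(abs(x - zeta))
    return errors


# Hypothetical test problem: f(x) = x^2 - 2 with non-degenerate zero sqrt(2).
errors = newton_errors(lambda x: x * x - 2, lambda x: 2 * x, 1.5, 2 ** 0.5, 4)
for i, e in enumerate(errors):
    print(i, e)
```

In this run each of the first three errors is bounded by the square of the previous one; after that the iteration stalls at the level of floating-point rounding, which is the kind of inexactness the rest of the paper quantifies.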


Newton iteration can be used to approximate zeros of systems of equations by means of path-following (see Morgan [4]). Let $f_t$ be a family of systems of $n$ equations of degree $d = (d_1, \dots, d_n)$ in $n$ variables. Assume that $f_t$ depends smoothly on $t$, and is parametrized by a multiple of arc length. Let $z_0$ be a good approximation of a zero of $f_0$. Then, generically, a zero of $f_1$ may be approximated through the following recurrence:

\[ z_{i+1} = N^{\mathrm{aff}}\left(f_{\frac{i+1}{N}}\right)(z_i) \]

provided that the number $N$ of homotopy steps is large enough. Shub and Smale gave in [6] an explicit $N$ in terms of $\max d_i$, the length of the curve $f_t$, and the condition number $\mu(f_t)$ (to be defined below).

Those results can be used to obtain theorems on the global complexity of approximating (in a very precise sense) the zeros of a system of polynomial equations (Shub and Smale [7, 8, 9]) using a model of complexity over the real numbers, where it is assumed that real number computations can be performed exactly and at unit cost (see Blum, Shub and Smale [2]).

If one wishes to obtain complexity bounds under a more traditional model of computation (BSS over $\mathbb{Z}$, or Turing), one can perform all operations exactly using rationals. Unfortunately, coefficient size (number of bits) may be multiplied by a constant at each iteration, so that the cost of approximating a zero would likely be exponential in the number of iterations.

The subject of this paper is an alternative, more practical approach. It turns out that theorems on quadratic convergence (e.g. Theorems 1, 3 and 5 below) and on path-following (e.g. the Main Theorem of [6]) can be extended to approximate Newton iteration. Namely:

• If Newton iteration is computed with error $\delta$, then under certain conditions we obtain a sequence $(z_i)$ approximating a zero $\zeta$ with error bounded by $\max\left(2^{-2^i - 1}, 6\delta\right)$ (Theorems 2, 4 and 6 below).

• The path-following problem can be solved within $100\, \mathrm{length}(f_t)\, \mu(f_t)^2\, (\max d_i)^{3/2}$ homotopy steps, with accuracy $\dfrac{1}{10\, (\max d_i)^{3/2}\, \mu(f_t)}$ (Theorem 10).

(The precise statement of the Theorems is left for Section 2, since it requires some additional definitions.) Thus, finite precision arithmetic may be used to obtain good approximations of zeros of polynomials through Newton iteration. Constructing an approximate Newton operator with error $\delta$ is not difficult. It is possible to bound its complexity in terms of $\delta$ and $\mu(f_t)$ (see Malajovich [3]).

If we are given a system of $n$ polynomial equations in $n$ variables $(x_1, \dots, x_n)$, it is convenient to homogenize the system by introducing a new variable $x_0$, so that each monomial in $f_i$ has degree $d_i = \deg f_i$: each monomial

\[ \prod_{1 \le j \le n} x_j^{I_j} \]

is replaced by

\[ x_0^{\,d_i - \sum_{1 \le j \le n} I_j} \prod_{1 \le j \le n} x_j^{I_j} \]

If $(x_1, \dots, x_n)$ is a zero of the original system, then $(1 : x_1 : \dots : x_n)$ is a zero of the homogeneous system. If $(z : x_1 : \dots : x_n)$ is a zero of the homogeneous system, and if $z \ne 0$, then $(x_1/z, \dots, x_n/z)$ is a zero of the original system. If $z = 0$, we say that $(0 : x_1 : \dots : x_n)$ is a zero at infinity.

Zeros at infinity are important for the complexity analysis of the classical Newton iteration ([6]), and can be treated as ordinary zeros by more general versions of Newton iteration. The whole theory becomes simpler by considering the projectivizations of the space of polynomial systems and of the space $\mathbb{C}^{n+1}$ of the roots, and by writing all the formulas in a unitarily invariant form, following [6]. Therefore, $Df(x)$ is an $n \times (n+1)$ matrix, and we have to define more general versions of Newton iteration.
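The homogenization rule above can be sketched in a few lines. This is only an illustration under stated assumptions: a polynomial is stored as a dictionary from exponent tuples $(I_1, \dots, I_n)$ to coefficients, and the names `homogenize` and `evaluate` and the sample polynomial are hypothetical, not from the paper.

```python
def homogenize(poly, degree=None):
    """Prepend the exponent of the new variable x_0 to every monomial.

    poly maps exponent tuples (I_1, ..., I_n) to coefficients; the result
    maps (d - sum(I), I_1, ..., I_n) to the same coefficients.
    """
    d = max(sum(e) for e in poly) if degree is None else degree
    return {(d - sum(e),) + tuple(e): c for e, c in poly.items()}


def evaluate(poly, point):
    """Evaluate a polynomial stored as {exponent tuple: coefficient}."""
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total


# Hypothetical example: f(x1, x2) = x1^2 + x2 - 3, which vanishes at (1, 2).
f = {(2, 0): 1, (0, 1): 1, (0, 0): -3}
F = homogenize(f)           # x1^2 + x0*x2 - 3*x0^2
print(F)
print(evaluate(F, (1, 1, 2)))   # the zero (1 : x1 : x2) of the homogeneous system
```

Evaluating `F` at a multiple $b\,(x_0, x_1, x_2)$ multiplies the value by $b^{d}$, which is why zeros of the homogeneous system are lines through the origin.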

2. Definitions and main results

Let $\mathcal{H}_d$ be the space of all systems of $n$ homogeneous polynomials in $n+1$ complex variables, of degree $d = (d_1, d_2, \dots, d_n)$, with complex coefficients. Let $D = \max d_i$, and assume $D \ge 2$. The space $\mathcal{H}_d$ is endowed with the unitarily invariant Kostlan norm $\|\cdot\|_k$, defined by $\|f\|_k = \sqrt{\sum_i \|f_i\|_k^2}$, where, writing $f_{iJ}$ for the coefficient of $x^J$ in $f_i$:

\[ \|f_i\|_k = \sqrt{\sum_{|J| = d_i} \frac{|f_{iJ}|^2}{\binom{d_i}{J}}} = \sqrt{\sum_{|J| = d_i} \frac{|f_{iJ}|^2}{\;\dfrac{d_i!}{J_0!\, J_1! \cdots J_n!}\;}} \]
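As a small illustration of the Kostlan norm formula, the sketch below computes it for polynomials stored as dictionaries from multi-indices $J = (J_0, \dots, J_n)$ to coefficients. The storage format and the function names are assumptions made for this example only.

```python
from math import factorial, sqrt


def multinomial(d, J):
    """Return d! / (J_0! J_1! ... J_n!) for a multi-index J with |J| = d."""
    out = factorial(d)
    for j in J:
        out //= factorial(j)   # exact: every partial quotient is an integer
    return out


def kostlan_norm_poly(poly):
    """Kostlan norm of one homogeneous polynomial {J: coefficient}."""
    d = sum(next(iter(poly)))  # all multi-indices share the same degree
    return sqrt(sum(abs(c) ** 2 / multinomial(d, J) for J, c in poly.items()))


def kostlan_norm(system):
    """Kostlan norm of a system (f_1, ..., f_n) of homogeneous polynomials."""
    return sqrt(sum(kostlan_norm_poly(p) ** 2 for p in system))


# (x0 + x1)^2 = x0^2 + 2 x0 x1 + x1^2 has squared norm 1 + 4/2 + 1 = 4.
square = {(2, 0): 1, (1, 1): 2, (0, 2): 1}
linear = {(1, 0): 3, (0, 1): 4}
print(kostlan_norm_poly(square))
print(kostlan_norm([square, linear]))
```

The first value equals $\|(1,1)\|_2^2 = 2$, an instance of the identity $\|\langle \cdot, a\rangle^d\|_k = \|a\|_2^d$, which is one way to see the unitary invariance of the norm on powers of linear forms.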

The Kostlan norm $\|\cdot\|_k$ induces a metric

\[ d_{\mathrm{proj}}(f, g) = \min_{\lambda} \frac{\|f - \lambda g\|_k}{\|g\|_k} \]

in $\mathcal{H}_d$. A zero of $f \in \mathcal{H}_d$ is a point $\zeta \in \mathbb{CP}^n$ such that $f(\zeta) = 0$. Alternatively, it is a line through the origin in $\mathbb{C}^{n+1}$ such that $f(\zeta) = 0$ for all $\zeta$ in that line.

Let $f \in \mathcal{H}_d$, and let $x$ range over $\mathbb{C}^{n+1}$. $Df(x)$ is a linear operator from $\mathbb{C}^{n+1}$ into $\mathbb{C}^n$. A generalized Newton operator is defined by the mapping:

\[ N^V : x \mapsto x - Df(x)|_{V(x)}^{-1} f(x) \]

where $V$ is a smooth family of hyperplanes in $\mathbb{C}^{n+1}$, $V(ax) = aV(x)$, and each $V(x)$ contains the point $x$. $V(x)$ inherits the metric of $\mathbb{C}^{n+1}$. The notation $Df(x)|_{V(x)}^{-1} f(x)$ represents a point of $V(x)$, as contained in $\mathbb{C}^{n+1}$. Different choices of $V$ lead to different versions of Newton iteration, as we will see.

To each generalized Newton operator, we can associate a few invariants. Those are functions of $\mathcal{H}_d \times \mathbb{C}^{n+1}$, and are invariant under the group generated by the following transformations:

Unitary: $(f, x) \mapsto (f \circ U^{-1}, Ux)$, $U \in U(n+1)$


Scaling: $(f_1, \dots, f_n, x) \mapsto (a_1 f_1, \dots, a_n f_n, bx)$, $a_i, b \in \mathbb{C}^*$

Scaling invariance implies that those invariants can be considered as functions of $\mathbb{P}(\mathcal{H}_d) \times \mathbb{CP}^n$. We define:

\[ \beta(f, z) = \beta(z) = \frac{1}{\|z\|_2} \left\| Df(z)|_{V(z)}^{-1} f(z) \right\|_2 \]

\[ \gamma(f, z) = \gamma(z) = \max\left(1,\ \|z\|_2 \max_{k \ge 2} \left\| Df(z)|_{V(z)}^{-1} \frac{D^k f(z)}{k!} \right\|_2^{\frac{1}{k-1}} \right) \]

\[ \alpha(f, z) = \alpha(z) = \beta(z)\, \gamma(z) \]

Here, we always have $\gamma \ge 1$. Also, $D^k f(z)$ is a multilinear operator from $(\mathbb{C}^{n+1})^k$ into $\mathbb{C}^n$. Therefore, those definitions are slightly different from the ones in [6], where $D^k f(z)$ is restricted to what we call $V(x)^k$. Invariance of $\alpha$, $\beta$ and $\gamma$ follows from the definitions.

The Newton operator in affine space: If we set:

\[ V(x) = x + \{(0, y_1, \dots, y_n)\} \]

we obtain the (classical) Newton operator in affine space $N^{\mathrm{aff}}$. If $\alpha(z_0)$ is small enough, successive iterates of $z_0$ will converge quadratically to a zero of $f$. The following theorem is essentially the Quadratic Convergence Theorem by Shub and Smale in [6]:

Theorem 1. Let $f \in \mathcal{H}_d$. Let $z_0 \in \mathbb{C}^{n+1}$ have its first coordinate non-zero. Let $\alpha^{\mathrm{aff}}(z_0) < 1/8$. Let the sequence $(z_i)$ be defined by $z_{i+1} = N^{\mathrm{aff}}(z_i)$. Then there is a zero $\zeta$ of $f$ such that

\[ d_{\mathrm{proj}}(z_i, \zeta) \le 2^{-2^i - 1} \]

Above, distance in projective space is measured by:

\[ d_{\mathrm{proj}}(x, y) = \min_{\lambda \in \mathbb{C}} \frac{\|x - \lambda y\|_2}{\|x\|_2} \]

There is a robust form of this theorem, which incorporates some error in each iteration. Since $N^{\mathrm{aff}}(f, z_i)$ scales in $\|z_i\|_2$, it makes sense to measure the error at each iteration by:

\[ \frac{\left\| z_{i+1} - N^{\mathrm{aff}}(f, z_i) \right\|_2}{\|z_i\|_2} \]

Indeed, we will prove:

Theorem 2. Let $f \in \mathcal{H}_d$, let $z_0 \in \mathbb{C}^{n+1}$, let the first coordinate of $z_0$ be non-zero, and let $\delta \ge 0$ verify: $(\beta^{\mathrm{aff}}(f, z_0) + \delta)\, \gamma^{\mathrm{aff}}(f, z_0) < 1/16$, and $\gamma^{\mathrm{aff}}(f, z_0)\, \delta < 1/384$. Let the sequence $(z_i)$, where the first coordinates of $z_i$ and $z_0$ are equal, verify:

\[ \frac{\left\| z_{i+1} - N^{\mathrm{aff}}(f, z_i) \right\|_2}{\|z_i\|_2} \le \delta \]


Then there is a zero $\zeta$ of $f$ such that:

\[ d_{\mathrm{proj}}(z_i, \zeta) \le \max\left(2^{-2^i - 1}, 6\delta\right) \]

Another version of the affine Newton operator was constructed by Morgan [4] by fixing a random vector $y$, and setting $V(x) = x + y^\perp$. This random change of coordinates allows him to use the classical Newton operator (in affine space) with systems that have zeros at infinity. There are more general Newton operators that make it possible to approximate zeros at infinity.

The Newton operator in projective space: If we define:

\[ V(x) = x + x^\perp \]

we obtain the Newton operator in projective space, which may also be described by:

\[ N^{\mathrm{proj}}(x) = x - \begin{bmatrix} Df(x) \\ x^* \end{bmatrix}^{-1} \begin{bmatrix} f(x) \\ 0 \end{bmatrix} \]

where $x^*$ means the complex (conjugate) transpose of $x$. This operator was defined by Shub [5]. We will prove the Theorems:

Theorem 3. Let $f \in \mathcal{H}_d$ and let $z_0 \in \mathbb{C}^{n+1}$ be such that $\alpha^{\mathrm{proj}}(z_0) < 1/32$. Let the sequence $(z_i)$ be defined by $z_{i+1} = N^{\mathrm{proj}}(z_i)$. Then there is a zero $\zeta$ of $f$ such that

\[ d_{\mathrm{proj}}(z_i, \zeta) \le 2^{-2^i - 1} \]

Theorem 4. Let $f \in \mathcal{H}_d$, $z_0 \in \mathbb{C}^{n+1}$ and assume that $\delta \ge 0$ verifies: $(\beta^{\mathrm{proj}}(f, z_0) + \delta)\, \gamma^{\mathrm{proj}}(f, z_0) < 1/32$, and $\gamma^{\mathrm{proj}}(f, z_0)\, \delta < 1/640$. Let the sequence $(z_i)$ verify:

\[ \frac{\left\| z_{i+1} - N^{\mathrm{proj}}(f, z_i) \right\|_2}{\|z_i\|_2} \le \delta \]

Then there is a zero $\zeta$ of $f$ such that:

\[ d_{\mathrm{proj}}(z_i, \zeta) \le \max\left(2^{-2^i - 1}, 10\delta\right) \]
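To make the shape of the robust bound concrete, here is a minimal sketch for the affine operator of Theorem 2 with $n = 1$, in the chart $z = (1, x)$. Everything specific is an assumption for illustration: the polynomial $x^2 - 2$, the value of `delta`, and the deterministic perturbation of size exactly $\delta \|z_i\|_2$. The sketch does not check the hypotheses of Theorem 2; it only shows one run that stalls at a projective distance of order $\delta$.

```python
from math import sqrt


def d_proj(x, y):
    """Projective distance min_l ||x - l*y|| / ||x|| for vectors in C^2.

    In two dimensions this equals |x0*y1 - x1*y0| / (||x|| ||y||).
    """
    cross = abs(x[0] * y[1] - x[1] * y[0])
    norm_x = sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)
    norm_y = sqrt(abs(y[0]) ** 2 + abs(y[1]) ** 2)
    return cross / (norm_x * norm_y)


def approx_newton_affine(x, delta):
    """Newton step for x^2 - 2 at z = (1, x), perturbed by delta * ||z||_2."""
    exact = x - (x * x - 2.0) / (2.0 * x)
    # The first coordinate stays 1, so the relative error of the step is delta.
    return exact + delta * sqrt(1.0 + abs(x) ** 2)


delta = 1e-6
x = 1.5
for _ in range(10):
    x = approx_newton_affine(x, delta)
dist = d_proj((1.0, x), (1.0, sqrt(2.0)))
print(dist, 6 * delta)
```

In this run the final distance stays below $6\delta$ but does not go to zero, matching the form $\max(2^{-2^i - 1}, 6\delta)$ of the bound.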

Pseudo-Newton operator: If we set:

\[ V(x) = x + (\ker Df(x))^\perp \]

we obtain the pseudo-Newton operator:

\[ N^{\mathrm{pseu}}(x) = x - Df(x)^\dagger f(x) \]

where $A^\dagger$ is the Moore-Penrose pseudo-inverse of $A$, defined by:

\[ A^\dagger = \left( A|_{(\ker A)^\perp} \right)^{-1} \]


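This notation refers to the case $\operatorname{rank} Df(x) = n$. In the case $\operatorname{rank} Df(x) < n$, the operator $N^{\mathrm{pseu}}$ is not defined. An equivalent definition in our case is the following: in the particular case where the matrix to invert is diagonal, we set:

\[ \begin{bmatrix} \sigma_1 & & & 0 \\ & \ddots & & \vdots \\ & & \sigma_n & 0 \end{bmatrix}^\dagger = \begin{bmatrix} \sigma_1^{-1} & & \\ & \ddots & \\ & & \sigma_n^{-1} \\ 0 & \cdots & 0 \end{bmatrix} \]

Then we extend this definition to all matrices of rank $n$ by setting, for any $U$, $V$ unitary:

\[ (U \Sigma V)^\dagger = V^* \Sigma^\dagger U^* \]

A very important property of the pseudo-inverse of $A : \mathbb{C}^{n+1} \to \mathbb{C}^n$ is that $A^\dagger y$ is the vector of minimal norm in the space $A^{-1} y$. Hence,

\[ \|A^\dagger\|_2 = \min_V \left\| (A|_V)^{-1} \right\|_2 \]

when $V$ ranges over all hyperplanes through the origin. This Newton operator was suggested by Allgower and Georg [1]. We will prove the Theorems:

Theorem 5. Let $f \in \mathcal{H}_d$ and let $z_0 \in \mathbb{C}^{n+1}$ be such that $\alpha^{\mathrm{pseu}}(z_0) < 1/8$. Let the sequence $(z_i)$ be defined by $z_{i+1} = N^{\mathrm{pseu}}(z_i)$. Then there is a zero $\zeta$ of $f$ such that

\[ d_{\mathrm{proj}}(z_i, \zeta) \le 2^{-2^i - 1} \]

Theorem 6. Let $f \in \mathcal{H}_d$, $z_0 \in \mathbb{C}^{n+1}$ and let $\delta \ge 0$ verify: $(\beta^{\mathrm{pseu}}(f, z_0) + \delta)\, \gamma^{\mathrm{pseu}}(f, z_0) < 1/16$, and $\gamma^{\mathrm{pseu}}(f, z_0)\, \delta < 1/384$. Let the sequence $(z_i)$ verify:

\[ \|z_{i+1} - N^{\mathrm{pseu}}(f, z_i)\|_2 \le \delta\, \|z_i\|_2 \]

Then there is a zero $\zeta$ of $f$ such that:

\[ d_{\mathrm{proj}}(z_i, \zeta) \le \max\left(2^{-2^i - 1}, 6\delta\right) \]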
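For $n = 1$ the pseudo-Newton step can be written out by hand, since a rank-one $1 \times 2$ matrix $A$ has pseudo-inverse $A^\dagger y = \bar{A}^T y / \|A\|_2^2$. The sketch below relies on that fact; the homogeneous polynomial $f(x_0, x_1) = x_1^2 - 2x_0^2$, the starting point and the function names are illustrative assumptions only.

```python
def pseudo_newton_step(f, grad, x):
    """Return x - Df(x)^dagger f(x) for one homogeneous polynomial in (x0, x1)."""
    g = grad(x)                                  # the 1 x 2 matrix Df(x)
    norm2 = sum(abs(gj) ** 2 for gj in g)        # ||Df(x)||_2^2, assumed non-zero
    fx = f(x)
    # Minimal-norm solution of Df(x) w = f(x): w = conj(Df(x))^T f(x) / ||Df(x)||^2
    return tuple(xj - gj.conjugate() * fx / norm2 for xj, gj in zip(x, g))


def f(x):
    return x[1] ** 2 - 2.0 * x[0] ** 2


def grad(x):
    return (-4.0 * x[0], 2.0 * x[1])


x = (1.0, 1.5)
for _ in range(6):
    x = pseudo_newton_step(f, grad, x)
ratio = x[1] / x[0]
print(x, ratio)
```

The iterates do not keep the first coordinate fixed; they converge to a point on the line $x_1 = \sqrt{2}\, x_0$, which is why the distance to the zero is measured projectively.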

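Path-following and conditioning:

The robustness results in [6] come out naturally in the generalized case. We can define some more invariants associated to a generalized Newton operator:

\[ \mu(f, x) = \max\left\{ 1,\ \|f\|_k \left\| Df(x)|_{V(x)}^{-1}\, \operatorname{diag}\left( \sqrt{d_i}\, \|x\|_2^{d_i - 1} \right) \right\|_2 \right\} \]

\[ \eta(f, x) = \frac{\left\| \operatorname{diag}\left( d_i^{-1/2}\, \|x\|_2^{-d_i} \right) f(x) \right\|_2}{\|f\|_k} \]

Invariants $\mu$ and $\eta$ are invariant under unitary transformations and under scalings of the form $(f, x) \mapsto (af, bx)$, $a, b \in \mathbb{C}^*$. The following estimates relate $\mu$ and $\eta$ to $\beta$ and $\gamma$:

\[ \beta(f, x) \le \mu(f, x)\, \eta(f, x) \]

\[ \gamma(f, x) \le \frac{\mu(f, x)\, D^{3/2}}{2} \]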


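The first estimate is obvious. The second one follows from the same proof as in Shub and Smale [6], III-1 and (in the case $\gamma = 1$) from the fact that $1 \le D^{3/2}/2$ when $D \ge 2$. Also, as in [6], the following estimates are true:

Lemma 1.

\[ \eta(g, \zeta) \le d_{\mathrm{proj}}(f, g) + \eta(f, \zeta) \]

\[ \mu(g, \zeta) \le \frac{\mu(f, \zeta)\,(1 + d_{\mathrm{proj}}(f, g))}{1 - \sqrt{D}\, d_{\mathrm{proj}}(f, g)\, \mu(f, \zeta)} \]

The number of steps and the precision necessary for following a path $(f_t, \zeta_t)$ will depend on the following Theorems, which are modified versions of Theorem 3 in [6], I-3:

Theorem 7. There are $\bar\alpha = 0.02$, $\bar u = 0.05$ such that, if $\bar\gamma \ge 1$ and:

\[ \eta^{\mathrm{aff}}(f, \zeta)\, \mu^{\mathrm{aff}}(f, \zeta) \le \frac{\bar\alpha}{\bar\gamma}, \qquad d_{\mathrm{proj}}(x, \zeta) \le \frac{\bar u}{\bar\gamma}, \qquad \gamma^{\mathrm{aff}}(f, \zeta) \le \bar\gamma \]

then, setting $x' = N^{\mathrm{aff}}(f, x)$, and $\zeta'$ the zero associated to $x'$, we get:

\[ d_{\mathrm{proj}}(x', \zeta') \le \frac{\bar u}{2\bar\gamma} \]

Theorem 8. There are $\bar\alpha = 0.01$, $\bar u = 0.005$ such that, if $\bar\gamma \ge 1$ and:

\[ \eta^{\mathrm{proj}}(f, \zeta)\, \mu^{\mathrm{proj}}(f, \zeta) \le \frac{\bar\alpha}{\bar\gamma}, \qquad d_{\mathrm{proj}}(x, \zeta) \le \frac{\bar u}{\bar\gamma}, \qquad \gamma^{\mathrm{proj}}(f, \zeta) \le \bar\gamma \]

then, setting $x' = N^{\mathrm{proj}}(f, x)$, and $\zeta'$ the zero associated to $x'$, we get:

\[ d_{\mathrm{proj}}(x', \zeta') \le \frac{\bar u}{2\bar\gamma} \]

Theorem 9. There are $\bar\alpha = 0.02$, $\bar u = 0.05$ such that, if $\bar\gamma \ge 1$ and:

\[ \eta^{\mathrm{pseu}}(f, \zeta)\, \mu^{\mathrm{pseu}}(f, \zeta) \le \frac{\bar\alpha}{\bar\gamma}, \qquad d_{\mathrm{proj}}(x, \zeta) \le \frac{\bar u}{\bar\gamma}, \qquad \gamma^{\mathrm{pseu}}(f, \zeta) \le \bar\gamma \]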


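Then, setting $x' = N^{\mathrm{pseu}}(f, x)$, and $\zeta'$ the zero associated to $x'$, we get:

\[ d_{\mathrm{proj}}(x', \zeta') \le \frac{\bar u}{2\bar\gamma} \]

It is immediate that:

Corollary 1. In each of the three cases $N = N^{\mathrm{aff}}, N^{\mathrm{proj}}, N^{\mathrm{pseu}}$, there are $\bar\alpha$, $\bar u$ such that, if $\bar\gamma \ge 1$ and:

\[ \eta(f, \zeta)\, \mu(f, \zeta) \le \frac{\bar\alpha}{\bar\gamma}, \qquad d_{\mathrm{proj}}(x, \zeta) \le \frac{\bar u}{\bar\gamma}, \qquad \gamma(f, \zeta) \le \bar\gamma, \qquad \delta \le \frac{\bar u}{2\bar\gamma} \]

then, setting $x'$ such that $\dfrac{\|x' - N(f, x)\|_2}{\|x\|_2} \le \delta$, and if $\zeta'$ is the zero associated to $x'$, we get:

\[ d_{\mathrm{proj}}(x', \zeta') \le \frac{\bar u}{\bar\gamma} \]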

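A generalization of the Main Theorem of Shub and Smale in [6] for approximate Newton iteration follows:

Theorem 10. Assume that $N = N^{\mathrm{aff}}$, $N^{\mathrm{proj}}$ or $N^{\mathrm{pseu}}$. Let $\bar\alpha$ and $\bar u$ be given by Theorems 7, 8 or 9, respectively. Let $(f_t, \zeta_t)$ be a path in $\mathcal{H}_d \times \mathbb{C}^{n+1}$, so that $f_t(\zeta_t) = 0$. Let $\bar\mu \ge \max(\mu(f_t, \zeta_t))$ be finite. Let $\bar\gamma = \frac{2}{3} D^{3/2} \bar\mu$. Let $z_0$ verify $d_{\mathrm{proj}}(z_0, \zeta_0) \le \bar u / \bar\gamma$. Let $(t_i)$ be a sequence such that $d_{\mathrm{proj}}(f_{t_i}, f_{t_{i+1}}) \le \Delta \le \frac{3}{8} \frac{\bar\alpha}{\bar\gamma \bar\mu}$. Let $(z_i)$ verify:

\[ \frac{\left\| z_{i+1} - N(f_{t_{i+1}}, z_i) \right\|_2}{\|z_i\|_2} \le \frac{\bar u}{2\bar\gamma} \]

Then $d_{\mathrm{proj}}(z_i, \zeta_{t_i}) \le \bar u / \bar\gamma$, and hence $\alpha(f_{t_i}, z_i)$ is bounded as in Lemma 14.

In particular, if the length of the path $f_t$ is bounded by $L$, then $\frac{16}{9}\, L\, \bar\mu^2 D^{3/2} / \bar\alpha$ steps of approximate Newton iteration with error less than $\frac{4}{9 D^{3/2}}\, \bar u / \bar\mu$ suffice to follow the path $(f_t, \zeta_t)$ and obtain a zero of $f_1$.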
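The path-following recurrence with inexact Newton steps can be sketched in one variable. The homotopy $f_t(x) = x^2 - (1 + t)$, the number of steps, the error size `delta` and the function names below are all assumptions chosen for illustration; the step count is not derived from the bound of Theorem 10.

```python
from math import sqrt


def follow_path(z0, steps, delta=0.0):
    """Track a zero of f_t(x) = x^2 - (1 + t) from t = 0 to t = 1.

    Each homotopy step applies one Newton step for f at t = (i + 1) / steps,
    then adds a perturbation of size delta * ||(1, z_i)||_2.
    """
    z = z0
    for i in range(steps):
        t = (i + 1) / steps
        exact = z - (z * z - (1.0 + t)) / (2.0 * z)
        z = exact + delta * sqrt(1.0 + z * z)
    return z


def polish(z, steps=3):
    """A few exact Newton steps for the final system f_1(x) = x^2 - 2."""
    for _ in range(steps):
        z = z - (z * z - 2.0) / (2.0 * z)
    return z


z_end = follow_path(1.0, steps=20, delta=1e-6)   # start at the zero 1 of f_0
z_polished = polish(z_end)
print(z_end, z_polished, sqrt(2.0))
```

The point reached at $t = 1$ is only a rough approximation of $\sqrt{2}$, but it lies in the region of quadratic convergence, so a few further Newton steps on $f_1$ recover the zero to machine precision.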

3. Estimates on $\beta$

Let the generalized Newton operator $N$ be one of $N^{\mathrm{aff}}$, $N^{\mathrm{proj}}$ or $N^{\mathrm{pseu}}$. Let $z_i$ be a sequence of points satisfying:

\[ \|z_{i+1} - N(f, z_i)\|_2 \le \delta\, \|z_i\|_2 \]

for some $\delta \ge 0$. In the affine case $N = N^{\mathrm{aff}}$, assume furthermore that $V(z_i) = V(z_{i+1})$. This follows from the hypothesis of Theorem 2 according to which the first coordinates of all the $z_i$ are equal.


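The case $\delta = 0$ represents the exact iteration $z_{i+1} = N(f, z_i)$. For notational convenience, we will write:

\[ \beta_i = \beta(f, z_i), \qquad \gamma_i = \gamma(f, z_i), \qquad \alpha_i = \alpha(f, z_i) = \beta_i \gamma_i \]

\[ u_i = \gamma_i \frac{\|z_{i+1} - z_i\|_2}{\|z_i\|_2}, \qquad \tilde\alpha_i = (\beta_i + \delta)\, \gamma_i, \qquad \psi(u) = 1 + 2u^2 - 4u \]

The following bounds are obvious, since $\gamma_i \ge 1$:

\[ \frac{\|z_{i+1}\|_2}{\|z_i\|_2} \le 1 + u_i \tag{1} \]

\[ \frac{\|z_i\|_2}{\|z_{i+1}\|_2} \le \frac{1}{1 - u_i} \tag{2} \]

Let $p(z)$ be the projection of $\mathbb{C}^{n+1}$ onto the $n$-plane $V(z)$, in the direction of $\ker Df(z)$. (We assume that $Df(z)$ has rank $n$.) Let $p(z', z)$ be the restriction of $p(z)$ to $V(z')$. Let $\kappa$ be a constant, $\kappa \ge \|p(z_i, z_{i+1})\|_2$ for all $i$. In the cases $N = N^{\mathrm{proj}}$ and $N = N^{\mathrm{pseu}}$, we require the stronger condition $\kappa \ge \|p(z_{i+1})\|_2$.

If we are using $N = N^{\mathrm{aff}}$, we have $V(z_i) = V(z_{i+1})$, hence we can take $\kappa = 1$. If we are using $N = N^{\mathrm{pseu}}$, then by construction we have that $V(z_i) \perp \ker Df(z_i)$. It follows that we can also take $\kappa = 1$. Later on, we will bound $\kappa$ in the case $N = N^{\mathrm{proj}}$.

The idea of the proof of the quadratic convergence theorem will be to show that, under certain circumstances,

\[ \tilde\alpha_{i+1} \le 4\, \tilde\alpha_i^2 \]

We start with:

Lemma 2. Under the conditions above,

\[ \beta_{i+1} - \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{(1 - u_i)^2}{\psi(u_i)}\, \delta \ \le\ \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{1 - u_i}{\psi(u_i)}\, \gamma_i\, (\beta_i + \delta)^2 \]

Proof of Lemma 2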

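In order to prove Lemma 2, we break $\beta_{i+1}$ as follows:

\[ \begin{aligned} \beta_{i+1} &= \frac{1}{\|z_{i+1}\|_2} \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1} f(z_{i+1}) \right\|_2 \\ &\le \frac{1}{\|z_{i+1}\|_2} \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1} Df(z_{i+1})|_{V(z_i)} \right\|_2 \left\| Df(z_{i+1})|_{V(z_i)}^{-1} Df(z_i)|_{V(z_i)} \right\|_2 \left\| Df(z_i)|_{V(z_i)}^{-1} f(z_{i+1}) \right\|_2 \end{aligned} \tag{3} \]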


Part 1: $Df(z_{i+1})|_{V(z_{i+1})}^{-1}\, Df(z_{i+1})|_{V(z_i)}$ is the projection $p(z_i, z_{i+1})$ from $V(z_i)$ into $V(z_{i+1})$ in the direction $\ker Df(z_{i+1})$. It follows that its norm is bounded by:

\[ \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1}\, Df(z_{i+1})|_{V(z_i)} \right\|_2 \le \kappa \tag{4} \]

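Part 2: We first write:

\[ Df(z_i)|_{V(z_i)}^{-1}\, Df(z_{i+1})|_{V(z_i)} = I + \sum_{k \ge 2} k\, \frac{Df(z_i)|_{V(z_i)}^{-1} D^k f(z_i)}{k!}\, (z_{i+1} - z_i)^{k-1} \]

We obtain the inequality:

\[ \left\| Df(z_i)|_{V(z_i)}^{-1}\, Df(z_{i+1})|_{V(z_i)} - I \right\|_2 \le \sum_{k \ge 2} k\, u_i^{k-1} = \frac{1}{(1 - u_i)^2} - 1 \]

It follows that:

\[ \left\| Df(z_{i+1})|_{V(z_i)}^{-1}\, Df(z_i)|_{V(z_i)} \right\|_2 \le \frac{1}{2 - \frac{1}{(1 - u_i)^2}} = \frac{(1 - u_i)^2}{\psi(u_i)} \tag{5} \]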

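Part 3: We expand:

\[ Df(z_i)|_{V(z_i)}^{-1} f(z_{i+1}) = Df(z_i)|_{V(z_i)}^{-1} f(z_i) + Df(z_i)|_{V(z_i)}^{-1} Df(z_i)(z_{i+1} - z_i) + \sum_{k \ge 2} \frac{Df(z_i)|_{V(z_i)}^{-1} D^k f(z_i)}{k!}\, (z_{i+1} - z_i)^k \]

Since (by hypothesis) $z_{i+1}$ cannot be at distance more than $\delta \|z_i\|_2$ from $z_i - Df(z_i)|_{V(z_i)}^{-1} f(z_i)$, the projection of $z_{i+1}$ into $V(z_i)$ cannot be at distance more than $\delta \|z_i\|_2$ from the projection of $z_i - Df(z_i)|_{V(z_i)}^{-1} f(z_i)$. Thus:

\[ \left\| Df(z_i)|_{V(z_i)}^{-1} f(z_i) + Df(z_i)|_{V(z_i)}^{-1} Df(z_i)(z_{i+1} - z_i) \right\|_2 \le \delta\, \|z_i\|_2 \]

For the terms of order $\ge 2$, we have:

\[ \begin{aligned} \left\| \sum_{k \ge 2} \frac{Df(z_i)|_{V(z_i)}^{-1} D^k f(z_i)}{k!}\, (z_{i+1} - z_i)^k \right\|_2 &\le \|z_{i+1} - z_i\|_2 \sum_{k \ge 2} \left( \gamma_i \frac{\|z_{i+1} - z_i\|_2}{\|z_i\|_2} \right)^{k-1} \\ &\le \frac{u_i}{1 - u_i}\, \|z_{i+1} - z_i\|_2 \\ &\le \frac{u_i}{1 - u_i}\, (\beta_i + \delta)\, \|z_i\|_2 \end{aligned} \]

Hence, we obtain:

\[ \left\| Df(z_i)|_{V(z_i)}^{-1} f(z_{i+1}) \right\|_2 \le \frac{u_i}{1 - u_i}\, (\beta_i + \delta)\, \|z_i\|_2 + \delta\, \|z_i\|_2 \tag{6} \]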

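Putting all together:

Inserting bounds (4), (5) and (6) into inequality (3), and using $u_i \le \gamma_i (\beta_i + \delta)$, we get:

\[ \beta_{i+1} \le \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{1 - u_i}{\psi(u_i)}\, \gamma_i\, (\beta_i + \delta)^2 + \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{(1 - u_i)^2}{\psi(u_i)}\, \delta \]

Hence,

\[ \beta_{i+1} - \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{(1 - u_i)^2}{\psi(u_i)}\, \delta \ \le\ \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{1 - u_i}{\psi(u_i)}\, \gamma_i\, (\beta_i + \delta)^2 \]

This proves Lemma 2.

4. Estimates on $\gamma$

Lemma 3. Let $z_i \in \mathbb{C}^{n+1}$ verify $\|p(z_i, z_{i+1})\|_2 \le \kappa$, let $\gamma_i = \gamma(f, z_i)$ and $u_i = \gamma_i \dfrac{\|z_{i+1} - z_i\|_2}{\|z_i\|_2}$. Then we have:

\[ \gamma_{i+1} \le \kappa\, \frac{\|z_{i+1}\|_2}{\|z_i\|_2}\, \frac{\gamma_i}{\psi(u_i)\,(1 - u_i)} \]

Note that in the statement above, we do not require $\dfrac{\|z_{i+1} - N(z_i)\|_2}{\|z_i\|_2} \le \delta$.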

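Proof of Lemma 3: We first estimate

\[ \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1}\, \frac{D^k f(z_{i+1})}{k!} \right\|_2 . \]

According to the estimates (4) and (5), we have:

\[ \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1}\, \frac{D^k f(z_{i+1})}{k!} \right\|_2 \le \kappa\, \frac{(1 - u_i)^2}{\psi(u_i)} \left\| Df(z_i)|_{V(z_i)}^{-1}\, \frac{D^k f(z_{i+1})}{k!} \right\|_2 \]

Moreover,

\[ \begin{aligned} \left\| Df(z_i)|_{V(z_i)}^{-1}\, \frac{D^k f(z_{i+1})}{k!} \right\|_2 &\le \sum_{l \ge 0} \left\| Df(z_i)|_{V(z_i)}^{-1}\, \frac{D^{k+l} f(z_i)}{k!\, l!} \right\|_2 \|z_{i+1} - z_i\|_2^{\,l} \\ &\le \sum_{l \ge 0} \frac{(k+l)!}{k!\, l!}\, \frac{\gamma_i^{k-1}\, u_i^{\,l}}{\|z_i\|_2^{k-1}} \\ &= \frac{\gamma_i^{k-1}}{\|z_i\|_2^{k-1}} \sum_{l \ge 0} \frac{(k+l)!}{k!\, l!}\, u_i^{\,l} \\ &= \frac{\gamma_i^{k-1}}{(1 - u_i)^{k+1}\, \|z_i\|_2^{k-1}} \end{aligned} \]

Thus,

\[ \left\| Df(z_{i+1})|_{V(z_{i+1})}^{-1}\, \frac{D^k f(z_{i+1})}{k!} \right\|_2 \le \kappa\, \frac{\gamma_i^{k-1}}{\psi(u_i)\, (1 - u_i)^{k-1}\, \|z_i\|_2^{k-1}} \]

Using $\psi(u_i) \le 1$, $\kappa \ge 1$ and extracting the $(k-1)$-th root, we obtain:

\[ \gamma_{i+1} \le \kappa\, \frac{\|z_{i+1}\|_2}{\|z_i\|_2}\, \frac{\gamma_i}{\psi(u_i)\,(1 - u_i)} \]

This proves Lemma 3.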


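5. Estimates on $\alpha$

In this section, we prove Theorems 1, 2, 5 and 6. Combining Lemma 2, equation (2) and Lemma 3, we obtain the following result:

Lemma 4. Under the hypotheses and notations of Lemma 2,

\[ \left( \beta_{i+1} - \kappa\, \frac{1 - u_i}{\psi(u_i)}\, \delta \right) \gamma_{i+1} \le \frac{\kappa^2}{\psi(u_i)^2}\, \big( (\beta_i + \delta)\, \gamma_i \big)^2 \]

Proof of Theorems 1 and 5: If we make $\delta = 0$, Lemma 4 reads:

\[ \alpha_{i+1} \le \frac{\kappa^2}{\psi(u_i)^2}\, \alpha_i^2 \tag{7} \]

Assume that we are in the hypotheses of Theorems 1 or 5. Then $\kappa = 1$. Also, $u_i = \alpha_i$. Assuming by induction that $\alpha_i \le 1/8$, we obtain $\psi(u_i) > 0.531 > 1/2$ and equation (7) implies:

\[ \alpha_{i+1} \le 4 \alpha_i^2 \le 1/16 \le 1/8 \]

By induction, $\alpha_i \le 2^{-2^i - 2}$ and hence:

\[ \beta_i \le \alpha_i \le 2^{-2^i - 2} \]

\[ d_{\mathrm{proj}}(z_i, \zeta) \le \sum_{j \ge i} \beta_j \le 2 \beta_i \le 2^{-2^i - 1} \]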

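This proves Theorems 1 and 5.

Lemma 4 allows us to prove the following statement:

Lemma 5. Assuming the hypotheses of Lemma 2, and using the same notation, let $\dfrac{\|z_{i+1} - N(z_i)\|_2}{\|z_i\|_2} \le \delta$, $\delta > 0$, $(\beta_0 + \delta)\, \gamma_0 \le 1/8$, and suppose that for $\delta \ne 0$, $0 \le i < j$ we have:

\[ \beta_{i+1} \ge \frac{1 + \kappa\, \dfrac{1 - u_i}{\psi(u_i)}}{4\, \dfrac{\psi(u_i)^2}{\kappa^2} - 1}\ \delta \tag{8} \]

where the denominator is positive. Then $(\beta_{i+1} + \delta)\, \gamma_{i+1} \le 4\, ((\beta_i + \delta)\, \gamma_i)^2$, and hence $(\beta_j + \delta)\, \gamma_j \le 2^{-2^j - 2}$. This also implies $\alpha_j \le 2^{-2^j - 2}$, and $d_{\mathrm{proj}}(z_j, \zeta) \le 2^{-2^j - 1}$, where $\zeta$ is a zero of $f$.

Proof of Lemma 5: Equation (8) is the same as:

\[ \beta_{i+1} + \delta \le 4\, \frac{\psi(u_i)^2}{\kappa^2}\, \beta_{i+1} - \kappa\, \frac{1 - u_i}{\psi(u_i)}\, \delta \]

Plugging this formula in Lemma 4, we obtain:

\[ (\beta_{i+1} + \delta)\, \gamma_{i+1} \le 4\, \big( (\beta_i + \delta)\, \gamma_i \big)^2 \]

This proves Lemma 5.

Lemma 5 means that in the conditions of Theorems 2, 4 and 6, as long as $\delta$ is small enough relative to $\beta$, we have quadratic convergence. We still have to prove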


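that as soon as we are no longer in the conditions of Lemma 5, the sequence $z_i$ gets trapped in a disk of radius $6\delta$ over $\zeta$.

Lemma 6. If equation (8) is not true, and $u_i \le 1/16$, then $\beta_{i+1} \le \frac{28}{15}\, \delta$.

Proof of Lemma 6: With $\kappa = 1$, $\psi(u_i) \ge 3/4$ and $1 - u_i \le 1$,

\[ \beta_{i+1} < \frac{1 + \kappa\, \dfrac{1 - u_i}{\psi(u_i)}}{4\, \dfrac{\psi(u_i)^2}{\kappa^2} - 1}\ \delta \le \frac{1 + \frac{4}{3}}{\frac{9}{4} - 1}\ \delta = \frac{28}{15}\, \delta \]

Lemma 7. Let $\zeta$ be a zero of $f$. Let the disk $\mathcal{D}$ of center $\zeta$ and radius $2k\delta$, $k \ge 3$, verify for each $z \in \mathcal{D}$ the condition $\gamma(f, z) \le \Gamma$, with $(k+1)\Gamma\delta < 0.1$. Let $\beta(z_i) < k\delta$, $z_i \in \mathcal{D}$. Then $\beta(z_{i+1}) \le \left( \frac{k}{6} + 2 \right) \kappa\, \delta$. In particular, if $\kappa = 1$, $\beta(z_{i+1}) \le k\delta$ and $z_{i+1} \in \mathcal{D}$.

Proof of Lemma 7: According to Lemma 2,

\[ \beta_{i+1} \le \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{1 - u_i}{\psi(u_i)}\, \gamma_i\, (\beta_i + \delta)^2 + \kappa\, \frac{\|z_i\|_2}{\|z_{i+1}\|_2}\, \frac{(1 - u_i)^2}{\psi(u_i)}\, \delta \]

Using equation (2), and then $u_i \le \gamma_i (\beta_i + \delta) \le (k+1)\Gamma\delta$,

\[ \begin{aligned} \beta_{i+1} &\le \frac{\kappa}{\psi(u_i)} \left( \gamma_i\, ((k+1)\delta)^2 + (1 - u_i)\, \delta \right) \\ &\le \kappa \left( \frac{(k+1)\Gamma\delta}{\psi((k+1)\Gamma\delta)}\, (k+1) + \frac{1}{\psi((k+1)\Gamma\delta)} \right) \delta \end{aligned} \]

Using $(k+1)\Gamma\delta < 0.1$, we get $\psi((k+1)\Gamma\delta) > 0.6$, hence:

\[ \beta_{i+1} \le \kappa \left( \frac{(k+1)\, 0.1}{0.6} + \frac{1}{0.6} \right) \delta = \kappa\, \frac{k + 11}{6}\, \delta \le \left( \frac{k}{6} + 2 \right) \kappa\, \delta \]

This proves Lemma 7.

Lemma 8. Let $u = \gamma_i \dfrac{\|z - z_i\|_2}{\|z_i\|_2} \le \dfrac{1}{16}$. Then $\gamma(z) \le 1.52\, \kappa\, \gamma_i$.

For the proof, we use Lemma 3 and equation (1), according to which:

\[ \gamma(z) \le \kappa\, \frac{1 + u}{\psi(u)\,(1 - u)}\, \gamma_i \]

If $u \le 1/16$, then $\psi(u) > 3/4$ and:

\[ \gamma(z) \le \frac{4}{3} \cdot \frac{17}{15}\, \kappa\, \gamma_i \le 1.52\, \kappa\, \gamma_i \]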

Proof of Theorems 2 and 6:

Assume the hypotheses of Theorem 2 (resp. of Theorem 6). Let $k = 3$. Let us fix $j$ such that $\beta_i \ge k\delta$ for $i \le j$ and $\beta_{j+1} \le k\delta$. Let $\mathcal{D}$ be the disk of radius $2k\delta$ over $\zeta$.


Assuming that $\alpha_i < \frac{1}{8}$, Lemma 2 implies that $d_{\mathrm{proj}}(z_i, \zeta) \le 2\beta_i$. Indeed, by applying Lemma 2 to the exact Newton iteration starting from $z_i$, one would obtain:

\[ \beta_{i+1} \le \frac{\alpha_i}{\psi(\alpha_i)}\, \beta_i < \frac{\beta_i}{2} \tag{9} \]

Therefore, $\zeta$ is at distance at most $2\beta_j$ from $z_j$, and all points in $\mathcal{D}$ are within distance $4\beta_j$ of $z_j$. We have to consider several cases:

General case: $j \ge 1$. In that case, $\tilde\alpha_j \le 4\tilde\alpha_0^2 \le 1/64$. If $z \in \mathcal{D}$ is scaled properly, $\gamma_j \dfrac{\|z - z_j\|_2}{\|z_j\|_2} \le 4\alpha_j \le 1/16$. Therefore, we can apply Lemma 8 and obtain $\gamma(z) \le 1.52\, \gamma_j$. We fix $\Gamma = 2\gamma_j$. We also have $(k+1)\Gamma\delta < 2\, \frac{k+1}{k}\, \alpha_j \le 3\alpha_j < 0.1$. Hence, we apply Lemma 7 by induction, and conclude that $z_{j+1}, z_{j+2}, \dots$ belong to $\mathcal{D}$.

Special cases: $j = 0$ and $j$ does not exist (this means that $\beta_0 < 3\delta$). The case $j = 0$ is the more difficult, so we prove only this case. The proof of the other case is similar.

We claim that $\Gamma = 4\gamma_0$ verifies $\max_{\mathcal{D}} \gamma \le 2\gamma_1 \le 4\gamma_0 = \Gamma$. Indeed, $u_0 < 1/16$, hence by Lemma 8, $\gamma_1 \le 2\gamma_0$. The distance from $z_1$ to any point of $\mathcal{D}$ is bounded by $4k\delta$, and $4k\delta\gamma_1 \le 8k\delta\gamma_0 < 1/16$. We can use Lemma 8 again, and conclude that $\max_{\mathcal{D}} \gamma(z) \le 2\gamma_1 \le 4\gamma_0$. Thus, we can set $\Gamma = 4\gamma_0$.

In order to use Lemma 7, we have to check that $(k+1)\Gamma\delta < 0.1$. This amounts to checking that $4(k+1)\gamma_0\delta < 0.1$. This follows from the hypothesis on $\gamma_0\delta$. Thus, we use Lemma 7 by induction, and conclude that $z_{j+k} \in \mathcal{D}$.

Theorems 2 and 6 are now proved. In order to prove Theorems 3 and 4 we still need to be able to bound $\kappa$.

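6. Estimates on $\kappa$

We prove here Theorems 3 and 4. Let $z_i$ be a sequence such that:

\[ \frac{\left\| z_{i+1} - N^{\mathrm{proj}}(z_i) \right\|_2}{\|z_i\|_2} \le \delta, \qquad \delta \ge 0 \]

Assume that $\tilde\alpha_i^{\mathrm{proj}} = (\beta_i^{\mathrm{proj}} + \delta)\, \gamma_i^{\mathrm{proj}} \le \alpha_0$, with $\alpha_0$ a constant no more than $1/32$. Since $\alpha^{\mathrm{pseu}}(z_i) \le \alpha^{\mathrm{proj}}(z_i) \le \tilde\alpha^{\mathrm{proj}}(z_i) \le \alpha_0$, it follows from equation (9) that there is a zero $\zeta$ at distance from $z_i$ no more than $2\beta^{\mathrm{pseu}}(z_i) \le 2\alpha_0 / \gamma^{\mathrm{pseu}}(z_i) \le \dfrac{1}{16\, \gamma^{\mathrm{pseu}}(z_i)}$.

Lemma 9. In the conditions above,

\[ \|p(z_{i+1})\|_2 \le \frac{1}{\sqrt{1 - \left( \dfrac{9.12\, \alpha_0}{\psi(4.56\, \alpha_0)} \right)^2}} \]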


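Proof of Lemma 9: Let us scale $\zeta$ so that $\|\zeta\|_2 = 1$. Also, we can scale $z_i$ so that $z_i \in \zeta + \zeta^\perp$. Let us choose $y \in \ker Df(z_{i+1})$. By similarity of triangles,

\[ \|p(z_{i+1})\|_2 = \frac{1}{\sqrt{1 - d_{\mathrm{proj}}(y, z_{i+1})^2}} \]

We now estimate $d_{\mathrm{proj}}(y, z_{i+1})$. We set:

\[ v = \frac{\|z_{i+1} - \zeta\|_2}{\|\zeta\|_2}\, \gamma^{\mathrm{pseu}}(\zeta) \]

We apply Lemma 8 to obtain:

\[ \gamma^{\mathrm{pseu}}(\zeta) \le 1.52\, \gamma^{\mathrm{pseu}}(z_i) \le 1.52\, \gamma^{\mathrm{proj}}(z_i) \]

Also,

\[ \frac{\|z_{i+1} - \zeta\|_2}{\|\zeta\|_2} \le \frac{\|z_{i+1} - z_i\|_2}{\|\zeta\|_2} + \frac{\|z_i - \zeta\|_2}{\|\zeta\|_2} \le \frac{\|z_i\|_2}{\|\zeta\|_2}\, (\beta_i + \delta + 2\beta_i) \le 3\, (\beta_i + \delta) \]

Hence,

\[ v = \frac{\|z_{i+1} - \zeta\|_2}{\|\zeta\|_2}\, \gamma^{\mathrm{pseu}}(\zeta) \le 4.56\, \alpha_0 \]

We can scale $y$ so that we can write $y = \zeta + y^\perp$, $y^\perp \perp \zeta$; then $d_{\mathrm{proj}}(y, \zeta) \le \|y^\perp\|_2$. By the choice of $y$,

\[ Df(z_{i+1})\, y = Df(z_{i+1})(\zeta + y^\perp) = 0 \]

Expanding $Df$ around $\zeta$, we obtain:

\[ Df(\zeta)(\zeta + y^\perp) + \sum_{k \ge 2} k\, \frac{D^k f(\zeta)}{k!}\, (z_{i+1} - \zeta)^{k-1} (\zeta + y^\perp) = 0 \]

Obviously, $Df(\zeta)\zeta = 0$; we apply $Df(\zeta)^\dagger$ to the equation, and obtain:

\[ y^\perp + \sum_{k \ge 2} k\, \frac{Df(\zeta)^\dagger D^k f(\zeta)}{k!}\, (z_{i+1} - \zeta)^{k-1} (\zeta + y^\perp) = 0 \]

Now, we have:

\[ \left\| \sum_{k \ge 2} k\, \frac{Df(\zeta)^\dagger D^k f(\zeta)}{k!}\, (z_{i+1} - \zeta)^{k-1} \right\|_2 \le \sum_{k \ge 2} k\, \gamma(\zeta)^{k-1}\, \|z_{i+1} - \zeta\|_2^{k-1} \le \sum_{k \ge 2} k\, v^{k-1} = \frac{1}{(1 - v)^2} - 1 \]

Thus,

\[ d_{\mathrm{proj}}(y, \zeta) \le \|y^\perp\|_2 \le \frac{\frac{1}{(1 - v)^2} - 1}{2 - \frac{1}{(1 - v)^2}} = \frac{(2 - v)\, v}{\psi(v)} \le \frac{2v}{\psi(v)} \]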


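Putting all together,

\[ \|p(z_{i+1})\|_2 \le \left( 1 - \left( \frac{2 \cdot 4.56\, \alpha_0}{\psi(4.56\, \alpha_0)} \right)^2 \right)^{-1/2} \]

and Lemma 9 is proved.

Proof of Theorem 3: We check numerically (using Lemma 9) that for $\alpha_0 = 1/32$, we have $\kappa \le 1.26$. Also, $\kappa^2 / \psi(\alpha_0)^2 \le 2.06 \le 4$, so equation (7) gives:

\[ \alpha_{i+1} \le 4 \alpha_i^2 \le 1/32 \]

And this proves Theorem 3.

Proof of Theorem 4: Using $\kappa \le 1.2567$, Lemmas 6, 7 and 8 become:

Lemma 10. If equation (8) is not true, and $u_i \le 1/16$, then $\beta_{i+1} \le 4.66\, \delta < 5\delta$.

Lemma 11. Let $\zeta$ be a zero of $f$. Let the disk $\mathcal{D}$ of center $\zeta$ and radius $2k\delta$, $k \ge 4$, verify for each $z \in \mathcal{D}$ the condition $\gamma(f, z) \le \Gamma$, with $(k+1)\Gamma\delta < 0.1$. Let $\beta(z_i) < k\delta$, $z_i \in \mathcal{D}$. Then $\beta(z_{i+1}) \le \left( \frac{k}{6} + 2 \right) \kappa\, \delta$. In particular, if $\kappa \le 1.2567$, then $\beta(z_{i+1}) \le k\delta$ and $z_{i+1} \in \mathcal{D}$.

Lemma 12. Let $u = \gamma_i \dfrac{\|z - z_i\|_2}{\|z_i\|_2} \le \dfrac{1}{16}$. Then $\gamma(z) \le 1.52\, \kappa\, \gamma_i \le 1.92\, \gamma_i < 2\gamma_i$.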

At this time, we set $k = 5$. The same proof as for Theorems 2 and 6 applies word by word to prove Theorem 4.

7. Proof of the Robustness results

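Proof of Lemma 1:

The first estimate is easy. The second follows from:

\[ \begin{aligned} \mu(g, \zeta) &= \|g\|_k \left\| \left( \operatorname{diag}\left( d_i^{-1/2}\, \|\zeta\|_2^{1 - d_i} \right) Dg(\zeta)|_{V_g(\zeta)} \right)^{-1} \right\|_2 \\ &\le \|g\|_k \left\| \left( \operatorname{diag}\left( d_i^{-1/2}\, \|\zeta\|_2^{1 - d_i} \right) Dg(\zeta)|_{V_f(\zeta)} \right)^{-1} \right\|_2 \end{aligned} \]

Then we proceed as in [6], using Lemma 5 of III-1.

Bounds on $\beta(f, x)$: Let us put ourselves in the conditions of Theorems 7, 8 or 9. By hypothesis, $\dfrac{\|x - \zeta\|_2}{\|\zeta\|_2}\, \gamma(\zeta) \le \bar u$. Also, we assume that $\bar u < 1/16$. The following estimate is very similar to Lemma 2: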

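Lemma 13.

\[ \beta(f, x) \le \kappa\, \frac{\|\zeta\|_2}{\|x\|_2}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \beta(f, \zeta) + d_{\mathrm{proj}}(x, \zeta)}{\psi(\bar u)} \]

Proof of Lemma 13: Using the fact that $\dfrac{\|x - \zeta\|_2}{\|\zeta\|_2}\, \gamma(\zeta) \le \bar u$, we can write, using Parts 1 and 2 of the proof of Lemma 2:

\[ \beta(f, x) \le \kappa\, \frac{(1 - \bar u)^2}{\psi(\bar u)}\, \frac{1}{\|x\|_2} \left\| Df(\zeta)|_{V(\zeta)}^{-1} f(x) \right\|_2 \]

Expanding the last term into its Taylor series, we obtain:

\[ \begin{aligned} \left\| Df(\zeta)|_{V(\zeta)}^{-1} f(x) \right\|_2 &\le \left\| Df(\zeta)|_{V(\zeta)}^{-1} f(\zeta) + Df(\zeta)|_{V(\zeta)}^{-1} Df(\zeta)(x - \zeta) \right\|_2 + \left\| \sum_{k \ge 2} \frac{Df(\zeta)|_{V(\zeta)}^{-1} D^k f(\zeta)}{k!}\, (x - \zeta)^k \right\|_2 \\ &\le \|\zeta\|_2\, \big( \beta(f, \zeta) + d_{\mathrm{proj}}(x, \zeta) \big) + \|\zeta\|_2\, \frac{\bar u}{1 - \bar u}\, d_{\mathrm{proj}}(x, \zeta) \end{aligned} \]

Therefore,

\[ \begin{aligned} \beta(f, x) &\le \kappa\, \frac{\|\zeta\|_2}{\|x\|_2}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \beta(f, \zeta) + \big( (1 - \bar u) + \bar u \big)\, d_{\mathrm{proj}}(x, \zeta)}{\psi(\bar u)} \\ &= \kappa\, \frac{\|\zeta\|_2}{\|x\|_2}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \beta(f, \zeta) + d_{\mathrm{proj}}(x, \zeta)}{\psi(\bar u)} \end{aligned} \]

This proves Lemma 13.

Lemma 3 gives:

\[ \gamma(f, x) \le \kappa\, \frac{\gamma(f, \zeta)}{\psi(\bar u)\,(1 - \bar u)}\, \frac{\|x\|_2}{\|\zeta\|_2} \]

Using Lemma 13 together with the previous estimate, we obtain:

Lemma 14.

\[ \alpha(f, x) \le \kappa^2\, \frac{(1 - \bar u)\, \alpha(f, \zeta) + \bar u}{\psi(\bar u)^2} \]

Now we use Lemma 2 and obtain:

Lemma 15.

\[ \beta(f, x') \le \kappa\, \frac{\|x\|_2}{\|x'\|_2}\, \frac{1 - \alpha(f, x)}{\psi(\alpha(f, x))}\, \alpha(f, x)\, \beta(f, x) \]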

Proof of Theorems 7 and 9:

We first set $\kappa = 1$. Let us assume for a while that:

\[ \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)^2} < 1/8 \tag{10} \]


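It follows from Lemma 14 that $\alpha(f, x) < 1/8$, and Lemma 15 implies:

\[ \beta(f, x') \le \frac{1}{8\, \psi(1/8)}\, \beta(f, x) \]

Hence:

\[ \beta(f, x') \le \frac{1}{8\, \psi(1/8)}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)}\, \frac{1}{\bar\gamma} \]

Hence, in order to obtain $\beta(f, x') \le \bar u / 2\bar\gamma$, we need:

\[ \frac{1}{8\, \psi(1/8)}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)} < \frac{\bar u}{2} \tag{11} \]

Numerically, we can verify that $\bar u = 0.05$ and $\bar\alpha = 0.02$ make conditions (10) and (11) true, proving Theorems 7 and 9.

Proof of Theorem 8:

Let us assume now that:

\[ \kappa^2\, \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)^2} < 1/32 \tag{12} \]

It follows from Lemma 14 that $\alpha(f, x) < 1/32$, and from Lemma 15 we obtain:

\[ \beta(f, x') \le \frac{\kappa}{32\, \psi(1/32)}\, \beta(f, x) \]

Hence:

\[ \beta(f, x') \le \frac{\kappa^2}{32\, \psi(1/32)}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)}\, \frac{1}{\bar\gamma} \]

Hence, in order to obtain $\beta(f, x') \le \bar u / 2\bar\gamma$, we need:

\[ \frac{\kappa^2}{32\, \psi(1/32)}\, (1 - \bar u)\, \frac{(1 - \bar u)\, \bar\alpha + \bar u}{\psi(\bar u)} < \frac{\bar u}{2} \tag{13} \]

If we further assume $\alpha < 1/32$, we always have $\kappa < 1.2567$. Numerically, we can verify that $\bar u = 0.005$ and $\bar\alpha = 0.01$ make conditions (12) and (13) true, proving Theorem 8.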
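The numerical verifications invoked in these proofs are easy to reproduce. The sketch below is an assumption-laden reading of the conditions rather than the author's computation: it takes $\psi(u) = 1 + 2u^2 - 4u$, the bound on $\kappa$ from Lemma 9, the quantity bounded in Lemma 14 (compared with $1/8$ for Theorems 7 and 9 and with $1/32$ for Theorem 8), and the left-hand sides of (11) and (13) with a conservative factor $\kappa^2$. All function names are hypothetical.

```python
from math import sqrt


def psi(u):
    """psi(u) = 1 + 2u^2 - 4u, as in Section 3."""
    return 1.0 + 2.0 * u * u - 4.0 * u


def kappa_bound(alpha0):
    """Bound on ||p(z_{i+1})||_2 read from Lemma 9."""
    r = 9.12 * alpha0 / psi(4.56 * alpha0)
    return 1.0 / sqrt(1.0 - r * r)


def alpha_bound(alpha_bar, u_bar, kappa=1.0):
    """Bound on alpha(f, x) from Lemma 14, with alpha(f, zeta) <= alpha_bar."""
    return kappa ** 2 * ((1 - u_bar) * alpha_bar + u_bar) / psi(u_bar) ** 2


def beta_factor(alpha_bar, u_bar, a, kappa=1.0):
    """Left-hand side of (11) / (13); a is the bound on alpha(f, x)."""
    contraction = a / psi(a)            # equals 1 / (8 psi(1/8)) when a = 1/8
    return (kappa ** 2 * contraction * (1 - u_bar)
            * ((1 - u_bar) * alpha_bar + u_bar) / psi(u_bar))


kappa = kappa_bound(1 / 32)
print("kappa bound:", kappa)
print("Theorems 7, 9:", alpha_bound(0.02, 0.05), beta_factor(0.02, 0.05, 1 / 8))
print("Theorem 8:", alpha_bound(0.01, 0.005, kappa),
      beta_factor(0.01, 0.005, 1 / 32, kappa))
```

With these readings, the constants $\bar u = 0.05$, $\bar\alpha = 0.02$ keep the Lemma 14 bound below $1/8$ and the left-hand side of (11) below $\bar u / 2$, and $\bar u = 0.005$, $\bar\alpha = 0.01$ do the same for (12) and (13) with threshold $1/32$.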

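Proof of Theorem 10: We first set:

\[ \bar\gamma = \frac{2}{3}\, D^{3/2}\, \bar\mu \]

and:

\[ \Delta \le \frac{3}{8}\, \frac{\bar\alpha}{\bar\gamma\, \bar\mu} = \frac{9}{16}\, \frac{\bar\alpha}{\bar\mu^2 D^{3/2}} \]

We assume by induction that $d_{\mathrm{proj}}(z_{i-1}, \zeta_{t_{i-1}}) \le \bar u / \bar\gamma$. We want to verify that we are in the conditions of Corollary 1. Using $\bar\alpha \le 0.02$, we obtain the estimates $\Delta \le 0.04$ and $\sqrt{D}\, \Delta\, \bar\mu \le 0.04$.

We use Lemma 1:

\[ \begin{aligned} \gamma(f_{t_i}, \zeta_{t_{i-1}}) &\le \frac{D^{3/2}}{2}\, \mu(f_{t_i}, \zeta_{t_{i-1}}) \\ &\le \frac{D^{3/2}}{2}\, \mu(f_{t_{i-1}}, \zeta_{t_{i-1}})\, \frac{1 + \Delta}{1 - \sqrt{D}\, \Delta\, \bar\mu} \\ &\le \frac{1.04}{1.92}\, D^{3/2}\, \bar\mu \le \frac{2}{3}\, D^{3/2}\, \bar\mu = \bar\gamma \end{aligned} \]

\[ \begin{aligned} \eta(f_{t_i}, \zeta_{t_{i-1}})\, \mu(f_{t_i}, \zeta_{t_{i-1}}) &\le \Delta\, \bar\mu\, \frac{1 + \Delta}{1 - \sqrt{D}\, \Delta\, \bar\mu} \\ &\le \frac{3}{8} \cdot \frac{1.04}{0.96}\, \frac{\bar\alpha}{\bar\gamma} \le \frac{\bar\alpha}{\bar\gamma} \end{aligned} \]

Now we can apply Corollary 1, and conclude that $d_{\mathrm{proj}}(z_i, \zeta_{t_i}) \le \bar u / \bar\gamma$. This proves Theorem 10.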

References

1. Eugene L. Allgower and Kurt Georg, Continuation and Path-Following. Preprint, Colorado State University, May 1992. Acta Numerica (1992), 1-64.
2. Lenore Blum, Mike Shub and Steve Smale, On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines. Bulletin of the AMS, 21, 1, July 1989.
3. Gregorio Malajovich, On the complexity of path-following Newton algorithms for solving systems of polynomial equations with integer coefficients. PhD Thesis, Berkeley, 1993. United Microfilms Inc, 300 North Zeeb Road, Ann Arbor, MI, 48106-1346 USA, Phone 800-521-0600.
4. Alexander Morgan, Solving polynomial systems using continuation for engineering and scientific purposes. Prentice Hall Inc, Englewood Cliffs, New Jersey, 1987.
5. Michael Shub, Some remarks on Bezout's Theorem and complexity theory. In M. Hirsch, J. Marsden and M. Shub (eds), From Topology to Computation, Proceedings of the Smalefest, 1993.
6. Michael Shub and Steve Smale, On the Complexity of Bezout's Theorem I - Geometric aspects. Journal of the AMS, 6, 2, Apr 1993.
7. Michael Shub and Steve Smale, On the complexity of Bezout's Theorem II - Volumes and Probabilities. In F. Eysette and A. Galligo (eds), Computational Algebraic Geometry. Progress in Mathematics 109, Birkhauser, 267-285, 1993.
8. Michael Shub and Steve Smale, Complexity of Bezout's Theorem III; Condition number and packing. Journal of Complexity 9, 4-14, 1993.
9. Michael Shub and Steve Smale, Complexity of Bezout's Theorem IV; Probability of success; Extensions. Preprint, Berkeley, 1993.
10. Steve Smale, Newton method estimates from data at one point. In R. Erwing, K. Gross and C. Martin (eds), The merging of disciplines: New directions in Pure, Applied and Computational Mathematics. Springer, New York, 1986.

Departamento de Matematica Aplicada da UFRJ, Caixa Postal 68530, CEP 21945, Rio de Janeiro, RJ, BRASIL

E-mail address : [email protected]