Applied Mathematics and Computation 240 (2014) 339–347


Spectral method and its application to the conjugate gradient method

Dongyi Liu a,b,*, Liping Zhang a, Genqi Xu a,b

a Department of Mathematics, Tianjin University, Tianjin 300072, PR China
b Center for Applied Mathematics of Tianjin University, Tianjin 300072, PR China

Abstract: A new method for proving the global convergence of nonlinear conjugate gradient methods, the spectral method, is presented in this paper, and it is applied to a new conjugate gradient algorithm with the sufficient descent property. By analyzing the descent property, several concrete forms of this algorithm are suggested. Under standard Wolfe line searches, the global convergence of the new algorithm is proven for nonconvex functions. Preliminary numerical results for a set of 720 unconstrained optimization test problems verify the performance of the algorithm and show that the new algorithm is competitive with the CG_DESCENT algorithm.

Keywords: Conjugate gradient method; Descent property; Spectral analysis; Global convergence

1. Introduction and main idea

The classical nonlinear conjugate gradient (NCG) method with line searches is as follows:

$$x_{k+1} = x_k + \alpha_k d_k \qquad (1)$$

and

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k, \quad \forall k \ge 0, \qquad (2)$$

where $g_k = g(x_k) = \nabla f(x_k)$. It is a well-known method for the large-scale unconstrained optimization problem

$$\min\{f(x) : x \in \mathbb{R}^n\}.$$

To guarantee the global convergence of NCG methods, the fundamental assumptions on the objective function $f(x)$ are:

H1. $f$ is bounded below in $\mathbb{R}^n$ and continuously differentiable in a neighborhood $\mathcal{N}$ of the level set $\mathcal{L} = \{x : f(x) \le f(x_0)\}$, where $x_0$ is the starting point of the iteration.

H2. The gradient of $f$ is Lipschitz continuous in $\mathcal{N}$, that is, there exists a constant $L > 0$ such that

$$\|\nabla f(x) - \nabla f(\bar{x})\| \le L\|x - \bar{x}\|, \quad \forall x, \bar{x} \in \mathcal{N}.$$

The work of the authors is supported by the Natural Science Foundation of China Grant NSFC-61174080 and by the Seed Foundation of Tianjin University. Corresponding author at: Department of Mathematics, Tianjin University, Tianjin 300072, PR China. E-mail addresses: [email protected] (D. Liu), [email protected] (L. Zhang), [email protected] (G. Xu).

http://dx.doi.org/10.1016/j.amc.2013.12.094


The standard Wolfe line search strategy [20,21] is usually required:

$$f(x_k + \alpha_k d_k) \le f(x_k) + c_1 \alpha_k d_k^T g_k \qquad (3)$$

and

$$d_k^T g(x_k + \alpha_k d_k) \ge c_2 d_k^T g_k, \qquad (4)$$

where $0 < c_1 < c_2 < 1$. Other types of line searches are also often used, such as the strong Wolfe line search, the Goldstein-type line search and the Armijo-type line search [8,13,24]. In addition, the descent property [1]

$$d_k^T g_k < 0, \qquad (5)$$

or the sufficient descent property [11]

$$d_k^T g_k \le -c_0\|g_k\|^2 \ \text{ with } c_0 > 0, \qquad (6)$$

is also a necessary condition for global convergence. The Zoutendijk condition [25] is another important condition often used to prove the global convergence of NCG methods, as in [5,8]. Under the condition that the level set $\mathcal{L}$ is bounded, the Zoutendijk condition is utilized in [3,18,19,23,24]. Gilbert and Nocedal [11] introduced the so-called property (*) and proved the convergence of the modified PRP method (PRP+). Later on, their results were generalized by Dai et al. [6]. A similar idea was also used in [7,9,12,22] to prove global convergence under the assumption that the level set $\mathcal{L}$ is bounded. The algorithms mentioned above focus on the update of $\beta_k$ in (2) to guarantee that the corresponding algorithms converge globally. In this paper, we reformulate the line search directions of NCG methods as follows:

$$d_0 = -g_0, \qquad d_k = -M_k g_k, \quad \forall k \ge 1, \qquad (7)$$

where $M_k$ is called the conjugate gradient iteration matrix. We develop new algorithms by selecting a suitable iteration matrix $M_k$ and prove the global convergence by estimating the eigenvalues of $M_k^T M_k$. This proof method is therefore called the spectral method. In what follows, we introduce the spectral condition theorem for an objective function satisfying H1 and H2, which generalizes Theorem 4.1 in [15].

Theorem 1.1. Assume the objective function $f(x)$ satisfies H1 and H2. For an NCG method determined by (1) and (7) which satisfies the sufficient descent condition (6) and implements the standard Wolfe line searches (3) and (4), if

$$\sum_{k=1}^{\infty} \Lambda_k^{-1} = +\infty, \qquad (8)$$

where $\Lambda_k$ is the maximum eigenvalue of $M_k^T M_k$, then either $g_k = 0$ for some $k \ge 1$, or

$$\liminf_{k\to\infty} \|g_k\| = 0. \qquad (9)$$

Moreover, if $\Lambda_k \le \widetilde{\Lambda}$, where $\widetilde{\Lambda}$ is a positive constant, then

$$\lim_{k\to\infty} \|g_k\| = 0. \qquad (10)$$

Proof. Assume that $g_k \ne 0$ for all $k \ge 1$ and $\liminf_{k\to\infty}\|g_k\| \ne 0$. Then there exists $c > 0$ such that $\|g_k\| > c$ for all $k \ge 1$, and the sufficient descent condition (6) implies that $d_k \ne 0$ for all $k \ge 1$. It follows from (7) that

$$\|d_k\|^2 = g_k^T M_k^T M_k g_k \le \Lambda_k \|g_k\|^2. \qquad (11)$$

Thus, according to (6) and the above inequality, it can be deduced that

$$\cos^2\theta_k = \frac{(d_k^T g_k)^2}{\|d_k\|^2\|g_k\|^2} \ge c_0^2\,\frac{\|g_k\|^2}{\|d_k\|^2} \ge \frac{c_0^2}{\Lambda_k},$$

where $\theta_k$ is the angle between $d_k$ and $-g_k$. Thus,

$$\sum_{k\ge 1}\|g_k\|^2\cos^2\theta_k \ge c^2\sum_{k=1}^{\infty}\frac{c_0^2}{\Lambda_k} = \infty,$$

which contradicts the Zoutendijk condition (see also Theorem 2.1 in [11]),

$$\sum_{k\ge 1}\|g_k\|^2\cos^2\theta_k = \sum_{k=1}^{\infty}\frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \qquad (12)$$


Therefore, (8) implies that either $g_k = 0$ for some $k \ge 1$, or (9) holds. If $\Lambda_k \le \widetilde{\Lambda}$, then $\cos\theta_k \ge c_0/\sqrt{\widetilde{\Lambda}} > 0$, which together with (12) implies (10). □

It can be concluded from Theorem 1.1 that any conjugate gradient algorithm with the sufficient descent property is globally convergent if the maximum eigenvalue of $M_k^T M_k$ is bounded above.
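To make the scheme (1), (7) and the spectral condition (8) concrete, the following Python sketch (our own illustration, not code from the paper) runs a generic NCG-type iteration with a user-supplied rule for building the iteration matrix $M_k$ and records $\Lambda_k$, the largest eigenvalue of $M_k^T M_k$. The Wolfe line search is delegated to SciPy's line_search, which enforces the strong Wolfe conditions; the function names and the small fallback step are assumptions made for this sketch.

```python
import numpy as np
from scipy.optimize import line_search

def ncg_spectral(f, grad, x0, make_M, max_iter=200, tol=1e-6):
    """Generic NCG iteration x_{k+1} = x_k + alpha_k d_k with d_k = -M_k g_k (scheme (1), (7)).

    make_M(k, x, g, x_prev, g_prev, d_prev, alpha_prev) -> iteration matrix M_k.
    Returns the final iterate and the sequence Lambda_k = lambda_max(M_k^T M_k) of condition (8).
    """
    x, g = x0, grad(x0)
    d = -g                                   # d_0 = -g_0
    lambdas = []
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Wolfe line search (conditions (3)-(4)); strong Wolfe implies standard Wolfe.
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:                    # line search failed; fall back to a tiny step
            alpha = 1e-4
        x_prev, g_prev, d_prev = x, g, d
        x = x + alpha * d                    # iteration (1)
        g = grad(x)
        M = make_M(k + 1, x, g, x_prev, g_prev, d_prev, alpha)
        lambdas.append(np.linalg.eigvalsh(M.T @ M).max())   # Lambda_k in condition (8)
        d = -M @ g                           # direction (7)
    return x, lambdas

# Example: make_M returning the identity reproduces steepest descent,
# for which Lambda_k = 1 and condition (8) holds trivially.
```

With the Perry-type matrices of Section 2, make_M would assemble $P_{k+1}$ or $\widehat{P}_{k+1}$ from $s_k$ and $y_k$; Theorem 1.1 only asks that the recorded values satisfy (8), or stay bounded for the stronger conclusion (10).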

2. Application of the spectral method

In [16], the Perry conjugate gradient method is generalized according to Perry's idea [17], and a new Perry descent conjugate gradient algorithm (PDCGy) is presented; that is, the line search directions are formulated by

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \frac{y_k^T g_{k+1}}{d_k^T y_k}\,d_k - c\,\frac{d_k^T g_{k+1}}{d_k^T y_k}\,y_k, \quad \forall k \ge 0, \qquad (13)$$

where c is a preset parameter. Thus, the corresponding iteration matrix is defined by

$$P_{k+1} = I - \frac{d_k y_k^T}{d_k^T y_k} + c\,\frac{y_k d_k^T}{d_k^T y_k} = I - \frac{s_k y_k^T}{s_k^T y_k} + c\,\frac{y_k s_k^T}{s_k^T y_k}. \qquad (14)$$
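As an illustration (our own sketch, not code from the paper), the matrix (14) and the direction (13) can be formed directly from $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$:

```python
import numpy as np

def perry_matrix(s, y, c):
    """Iteration matrix P_{k+1} of (14): I - s y^T/(s^T y) + c y s^T/(s^T y)."""
    n, sy = s.size, s @ y
    return np.eye(n) - np.outer(s, y) / sy + c * np.outer(y, s) / sy

def pdcg_direction(g_new, d, y, c):
    """PDCGy direction (13), computed without forming the matrix explicitly."""
    dy = d @ y
    return -g_new + (y @ g_new) / dy * d - c * (d @ g_new) / dy * y
```

Since $s_k = \alpha_k d_k$, the two computations agree: pdcg_direction(g_new, d, y, c) equals -perry_matrix(alpha * d, y, c) @ g_new.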

To establish the global convergence of the PDCGy algorithm for nonconvex functions, we first restrict the value of $y_k^T s_k$ as follows:

$$\eta_k^s = \begin{cases} y_k^T s_k, & \text{if } \|g_k\|^2 \ge \eta\,\alpha_k\|d_k\|^2, \\ \|s_k\|^2, & \text{otherwise}, \end{cases} \qquad (15)$$

or, equivalently, the value of $y_k^T d_k$ as follows:

$$\eta_k^d = \begin{cases} y_k^T d_k, & \text{if } \|g_k\|^2 \ge \eta\,\alpha_k\|d_k\|^2, \\ \alpha_k\|d_k\|^2, & \text{otherwise}, \end{cases} \qquad (16)$$

where $\eta > 0$. Then the iteration matrix (14) is updated by

$$\widehat{P}_{k+1} = I + \frac{c\,y_k s_k^T - s_k y_k^T}{\eta_k^s}. \qquad (17)$$

Thus we can introduce new line search directions as follows:

$$d_0 = -g_0, \qquad d_{k+1} = -\widehat{P}_{k+1} g_{k+1} = -g_{k+1} + \frac{y_k^T g_{k+1}}{\eta_k^d}\,d_k - c\,\frac{d_k^T g_{k+1}}{\eta_k^d}\,y_k, \quad \forall k \ge 0. \qquad (18)$$
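A sketch of how the restriction (16) and the restricted direction (18) might be computed in practice (the function names are ours; the default eta = 0.001 follows the value suggested later in Section 2.3):

```python
import numpy as np

def restricted_eta_d(g_old, d, y, alpha, eta=0.001):
    """eta_k^d of (16): y^T d_k if ||g_k||^2 >= eta * alpha_k * ||d_k||^2, else alpha_k * ||d_k||^2."""
    if g_old @ g_old >= eta * alpha * (d @ d):
        return y @ d
    return alpha * (d @ d)

def rspdcg_direction(g_new, g_old, d, y, alpha, c, eta=0.001):
    """Restricted-spectrum Perry descent direction d_{k+1} of (18)."""
    eta_d = restricted_eta_d(g_old, d, y, alpha, eta)   # note: the test in (16) uses g_k, not g_{k+1}
    return -g_new + (y @ g_new) / eta_d * d - c * (d @ g_new) / eta_d * y
```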

Proposition 2.1 in the following subsection shows that the directions defined by (18) satisfy the sufficient descent property if

$$1 - (1-\delta)\sqrt{\widehat{\omega}_k^{-1}} \le c \le 1 + (1-\delta)\sqrt{\widehat{\omega}_k^{-1}}, \qquad (19)$$

where $\delta \in (0,1]$ and $\widehat{\omega}_k = \dfrac{y_k^T y_k\, d_k^T d_k}{(\eta_k^d)^2}$. Moreover, Proposition 2.3 and Theorem 2.5 show that the spectrum of the matrix $P_{k+1}^T P_{k+1}$ may be unbounded above, but the spectrum of the matrix $\widehat{P}_{k+1}^T \widehat{P}_{k+1}$ is bounded under the conditions (4) and (19). So the algorithm determined by the scheme (1) and (18) with (19) is called the Perry descent conjugate gradient algorithm with restricted spectrum, denoted by RSPDCGy.

2.1. Descent property and spectral analysis

We first prove the descent property in this subsection.

Proposition 2.1. For each constant $\delta \in (0,1]$, if $c$ satisfies (19), then the directions defined by (18) satisfy the sufficient descent property

$$d_{k+1}^T g_{k+1} \le -\delta\|g_{k+1}\|^2. \qquad (20)$$

Proof. First, it follows from (18) that

$$d_{k+1}^T g_{k+1} = -\|g_{k+1}\|^2 + (1-c)\,\frac{(y_k^T g_{k+1})(d_k^T g_{k+1})}{\eta_k^d}. \qquad (21)$$


Let $b = |1-c|$; then it follows from (19) that $b = |1-c| \le (1-\delta)/\sqrt{\widehat{\omega}_k}$. Now, we assume that $b > 0$ and consider the following quadratic equation:

$$b\,\widehat{\omega}_k\,\lambda^2 - 2(1-\delta)\lambda + b = 0.$$

Since its discriminant $\Delta = 4\big((1-\delta)^2 - \widehat{\omega}_k b^2\big) \ge 0$ and $1-\delta > 0$, the quadratic equation has a positive solution $a = \dfrac{2(1-\delta)+\sqrt{\Delta}}{2b\,\widehat{\omega}_k}$. Next, from (21) and the inequality

$$\frac{\big|(y_k^T g_{k+1})(d_k^T g_{k+1})\big|}{\eta_k^d} \le \frac{1}{2}\left(a\,\frac{(g_{k+1}^T d_k)^2\|y_k\|^2}{(\eta_k^d)^2} + \frac{\|g_{k+1}\|^2}{a}\right),$$

it follows that

$$d_{k+1}^T g_{k+1} \le -\|g_{k+1}\|^2 + b\,\frac{\big|(y_k^T g_{k+1})(d_k^T g_{k+1})\big|}{\eta_k^d} \le -\|g_{k+1}\|^2 + \frac{ab\,\|d_k\|^2\|y_k\|^2}{2(\eta_k^d)^2}\,\|g_{k+1}\|^2 + \frac{b}{2a}\,\|g_{k+1}\|^2 = -\left(1 - \frac{ab\,\widehat{\omega}_k}{2} - \frac{b}{2a}\right)\|g_{k+1}\|^2 = -\delta\|g_{k+1}\|^2.$$

When $b = 0$, (21) implies that $d_{k+1}^T g_{k+1} = -\|g_{k+1}\|^2 \le -\delta\|g_{k+1}\|^2$. □

Remark 2.2. When $\eta_k^s \equiv y_k^T s_k$, Proposition 2.1 means that the PDCGy algorithm also satisfies the sufficient descent property (20) under the following condition:

$$1 - (1-\delta)\sqrt{\omega_k^{-1}} \le c \le 1 + (1-\delta)\sqrt{\omega_k^{-1}}, \qquad (22)$$

where $\omega_k = \dfrac{y_k^T y_k\, s_k^T s_k}{(s_k^T y_k)^2}$. When $c \equiv 1$ in (13), the PDCGy algorithm is denoted by PDy1.
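For illustration (an assumption of ours, not code from the paper), the admissible interval (19) for $c$, and its special case (22), can be computed directly from $y_k$, $d_k$ and $\eta_k^d$; the default delta = 0.9 follows the value suggested later in Section 2.3.

```python
import numpy as np

def c_interval(y, d, eta_d, delta=0.9):
    """Admissible interval [c_lo, c_hi] for c in condition (19)."""
    omega_hat = (y @ y) * (d @ d) / eta_d**2        # \hat{omega}_k of (19)
    half_width = (1.0 - delta) / np.sqrt(omega_hat)
    return 1.0 - half_width, 1.0 + half_width

# With eta_d = y^T d (i.e. eta_k^s = y_k^T s_k), omega_hat reduces to omega_k and (19) becomes (22).
```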

Next, we analyze the eigenvalues of the matrix $\widehat{P}_{k+1}^T \widehat{P}_{k+1}$ defined by (17).

Proposition 2.3. Denote the maximum and minimum eigenvalues of $\widehat{P}_{k+1}^T \widehat{P}_{k+1}$ by $\lambda_k^{\max}$ and $\lambda_k^{\min}$, respectively. Assume that $c$ satisfies (19) and $\|y_k\| \le L\|s_k\|$ for $k = 1, 2, \ldots$, where $L > 0$. Then, when $\eta_k^s = s_k^T y_k$,

$$\lambda_k^{\max} = \max\{1, c^2\}\,\omega_k \quad \text{and} \quad \lambda_k^{\min} = \min\{1, c^2\omega_k\}, \qquad (23)$$

where $\omega_k = \dfrac{y_k^T y_k\, s_k^T s_k}{(s_k^T y_k)^2} \ge 1$, and

$$\lambda_k^{\max} \le \big(\widetilde{\eta}^{-1} L + (1-\delta)\big)^2, \quad \text{as } |y_k^T s_k| \ge \widetilde{\eta}\,\|s_k\|^2, \qquad (24)$$

where $\widetilde{\eta} > 0$ and $\delta \in (0,1]$; when $\eta_k^s = \|s_k\|^2$,

$$\lambda_k^{\max} \le 2 + 2(1-\delta) + \big(2L^2 + (1-\delta)\big)^2. \qquad (25)$$

Proof. From (17), it can be derived that

$$\det(\widehat{P}_{k+1}) = 1 + (c-1)\,\frac{s_k^T y_k}{\eta_k^s} + \frac{c\big(\|s_k\|^2\|y_k\|^2 - (s_k^T y_k)^2\big)}{(\eta_k^s)^2}, \qquad (26)$$

$$\widehat{P}_{k+1}^T \widehat{P}_{k+1} = I + \frac{(c-1)(s_k y_k^T + y_k s_k^T)}{\eta_k^s} + \frac{c^2\, y_k^T y_k\, s_k s_k^T - c\, y_k^T s_k\,(s_k y_k^T + y_k s_k^T) + s_k^T s_k\, y_k y_k^T}{(\eta_k^s)^2},$$

and

$$\operatorname{trace}\big(\widehat{P}_{k+1}^T \widehat{P}_{k+1}\big) = n + \frac{2(c-1)\,s_k^T y_k}{\eta_k^s} + \frac{(c^2+1)\|s_k\|^2\|y_k\|^2 - 2c\,(s_k^T y_k)^2}{(\eta_k^s)^2}. \qquad (27)$$

So, the real number 1 is an eigenvalue of $\widehat{P}_{k+1}^T \widehat{P}_{k+1}$ with geometric multiplicity $n-2$, and by the relationship between the trace, the determinant and the eigenvalues of a matrix, (26) and (27), the other two eigenvalues $\mu_1$ and $\mu_2$ satisfy

$$\begin{cases} \mu_1 + \mu_2 = 2 + \dfrac{2(c-1)\,s_k^T y_k}{\eta_k^s} + \dfrac{(c^2+1)\|s_k\|^2\|y_k\|^2 - 2c\,(s_k^T y_k)^2}{(\eta_k^s)^2}, \\[2mm] \mu_1\,\mu_2 = \left(1 + (c-1)\,\dfrac{s_k^T y_k}{\eta_k^s} + \dfrac{c\big(\|s_k\|^2\|y_k\|^2 - (s_k^T y_k)^2\big)}{(\eta_k^s)^2}\right)^{2}. \end{cases} \qquad (28)$$

Thus, when $\eta_k^s = s_k^T y_k$,

$$\mu_1 + \mu_2 = (c^2+1)\,\omega_k \quad \text{and} \quad \mu_1\,\mu_2 = c^2\omega_k^2,$$

which implies that $\mu_1 = c^2\omega_k$ and $\mu_2 = \omega_k$. So, (23) holds, and when $|y_k^T s_k| \ge \widetilde{\eta}\,\|s_k\|^2$,

$$\lambda_k^{\max} \le \big(1 + (1-\delta)\sqrt{\omega_k^{-1}}\big)^2\,\omega_k = \big(\sqrt{\omega_k} + (1-\delta)\big)^2 \le \big(\widetilde{\eta}^{-1} L + (1-\delta)\big)^2.$$

When $\eta_k^s = \|s_k\|^2$, then by the inequality $\|y_k\| \le L\|s_k\|$ it is deduced that

$$\widehat{\omega}_k = \frac{y_k^T y_k\, d_k^T d_k}{(\eta_k^d)^2} = \frac{y_k^T y_k\, s_k^T s_k}{(\eta_k^s)^2} = \frac{y_k^T y_k}{s_k^T s_k} \le L^2.$$

Thus, it follows from (19) and the first identity in (28) that

$$\mu_1 + \mu_2 \le 2 + \frac{2|(c-1)\,s_k^T y_k|}{\eta_k^s} + \frac{(1+|c|)^2\|s_k\|^2\|y_k\|^2}{(\eta_k^s)^2} \le 2 + \frac{2(1-\delta)\,|s_k^T y_k|}{\sqrt{\widehat{\omega}_k}\,\eta_k^s} + \Big(2 + (1-\delta)\sqrt{\widehat{\omega}_k^{-1}}\Big)^2\,\widehat{\omega}_k \le 2 + 2(1-\delta) + \big(2\widehat{\omega}_k + (1-\delta)\big)^2,$$

which implies that (25) holds. □

Remark 2.4. When $\eta_k^s \equiv s_k^T y_k$, then $\widehat{P}_{k+1} = P_{k+1}$, the iteration matrix of the PDCGy algorithm. Thus, the maximum eigenvalue $\lambda_k^{\max} = \max\{1, c^2\}\,\omega_k$, which means that it may be unbounded above.

From the above discussion, we present three concrete forms of the RSPDCGy algorithm. When $c \equiv 1$, it follows from (18) that

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \frac{y_k^T g_{k+1}}{\eta_k^d}\,d_k - \frac{d_k^T g_{k+1}}{\eta_k^d}\,y_k, \quad k \ge 0; \qquad (29)$$

the corresponding algorithm is denoted by RSPDy1.
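As a quick numerical illustration (our own check, not from the paper), the spectrum described in Proposition 2.3 can be verified directly: for $\eta_k^s = s_k^T y_k$ the eigenvalues of $\widehat{P}_{k+1}^T\widehat{P}_{k+1}$ are 1 (with multiplicity $n-2$), $\omega_k$ and $c^2\omega_k$, so the largest one grows with $\omega_k$ unless the restriction keeps it in check.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 6, 0.5
s, y = rng.standard_normal(n), rng.standard_normal(n)
if s @ y < 0:        # the Wolfe condition (4) guarantees s^T y > 0 along descent directions
    y = -y

eta_s = s @ y                                                        # unrestricted case: eta_k^s = s_k^T y_k
P_hat = np.eye(n) + (c * np.outer(y, s) - np.outer(s, y)) / eta_s    # matrix (17)
omega = (s @ s) * (y @ y) / (s @ y) ** 2                             # omega_k of (23)

eigs = np.sort(np.linalg.eigvalsh(P_hat.T @ P_hat))
expected = np.sort(np.r_[np.ones(n - 2), omega, c**2 * omega])       # spectrum claimed in (23)
print(np.allclose(eigs, expected))                                   # True
```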

When $c$ satisfies

$$c = 1, \ \text{ as } \frac{(y_k^T g_{k+1})(d_k^T g_{k+1})}{\eta_k^d} \ge 0, \qquad \text{and} \qquad c = 1 - \frac{1-\delta}{\sqrt{\widehat{\omega}_k}}, \ \text{ as } \frac{(y_k^T g_{k+1})(d_k^T g_{k+1})}{\eta_k^d} < 0, \qquad (30)$$

the corresponding algorithm (the scheme (1), (18) and (30)) is denoted by RSPDy2. When $c$ satisfies

$$c = 1 + \frac{1-\delta}{\sqrt{\widehat{\omega}_k}}, \ \text{ as } \frac{(y_k^T g_{k+1})(d_k^T g_{k+1})}{\eta_k^d} \ge 0, \qquad \text{and} \qquad c = 1 - \frac{1-\delta}{\sqrt{\widehat{\omega}_k}}, \ \text{ as } \frac{(y_k^T g_{k+1})(d_k^T g_{k+1})}{\eta_k^d} < 0, \qquad (31)$$

the corresponding algorithm (the scheme (1), (18) and (31)) is denoted by RSPDy3.
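A small sketch (names and structure are ours, not the authors' code) of how the parameter $c$ could be selected per iteration according to (30) or (31), reusing the $\widehat{\omega}_k$ and $\eta_k^d$ quantities defined above:

```python
import numpy as np

def choose_c(g_new, d, y, eta_d, delta=0.9, variant="RSPDy2"):
    """Select c by rule (30) (variant RSPDy2) or rule (31) (variant RSPDy3)."""
    omega_hat = (y @ y) * (d @ d) / eta_d**2            # \hat{omega}_k of (19)
    shift = (1.0 - delta) / np.sqrt(omega_hat)
    sign_term = (y @ g_new) * (d @ g_new) / eta_d       # the quantity tested in (30)-(31)
    if sign_term >= 0:
        return 1.0 if variant == "RSPDy2" else 1.0 + shift
    return 1.0 - shift
```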

2.2. Convergence

We first prove the global convergence of the RSPDCGy algorithms for nonconvex functions.

Theorem 2.5. Assume that H1 and H2 hold. If the RSPDCGy algorithm implements the standard Wolfe line searches (3) and (4), then $g_k = 0$ for some $k \ge 1$, or $\lim_{k\to\infty}\|g_k\| = 0$.

Proof. Denote the maximum eigenvalue of $\widehat{P}_{k+1}^T \widehat{P}_{k+1}$ by $\lambda_k^{\max}$, where $\widehat{P}_{k+1}$ is the iteration matrix of the RSPDCGy algorithm defined by (17). Assume that $g_k \ne 0$ for all $k \ge 1$. When $\|g_k\|^2 \ge \eta\,\alpha_k\|d_k\|^2$, then $\eta_k^s = y_k^T s_k$, and it follows from (4) and (20) in Proposition 2.1 that

$$y_k^T s_k \ge -\alpha_k(1-c_2)\,d_k^T g_k \ge \alpha_k\delta(1-c_2)\|g_k\|^2 \ge \eta\delta(1-c_2)\|s_k\|^2. \qquad (32)$$

Thus, it can be derived from H2, (19), (24) and (25) in Proposition 2.3 that

$$0 < \lambda_k^{\max} \le \max\left\{\left(\frac{L}{\eta\delta(1-c_2)} + 1\right)^2,\; 4 + (2L^2+1)^2\right\}.$$

Therefore, Theorem 1.1 yields $\lim_{k\to\infty}\|g_k\| = 0$. □
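Putting the pieces together, a compact and purely illustrative driver for the RSPDy2 scheme (1), (18), (30) might look as follows; it reuses the hypothetical helpers restricted_eta_d, rspdcg_direction and choose_c sketched above and a Wolfe line search from SciPy, and is not the authors' Fortran implementation.

```python
import numpy as np
from scipy.optimize import line_search

def rspdy2(f, grad, x0, max_iter=500, tol=1e-6, eta=0.001, delta=0.9):
    """Illustrative RSPDy2 loop: x_{k+1} = x_k + alpha_k d_k with d_{k+1} from (18) and c from (30)."""
    x, g = x0, grad(x0)
    d = -g                                              # d_0 = -g_0
    for _ in range(max_iter):
        if np.max(np.abs(g)) < tol:                     # ||g||_inf < tol, as in Section 2.3
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0] or 1e-4
        x_new = x + alpha * d                           # iteration (1)
        g_new = grad(x_new)
        y = g_new - g
        eta_d = restricted_eta_d(g, d, y, alpha, eta)   # restriction (16)
        c = choose_c(g_new, d, y, eta_d, delta)         # rule (30)
        d = rspdcg_direction(g_new, g, d, y, alpha, c, eta)   # direction (18)
        x, g = x_new, g_new
    return x
```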

Next, we show the convergence of the PDCGy algorithm for general functions when $\lim_{k\to\infty}\|y_k\| = 0$.

Theorem 2.6. Assume that H1 and H2 hold. For the PDCGy algorithm, the line search directions $d_{k+1}$ are calculated by (13) and $c$ satisfies the condition (22). If it implements the standard Wolfe line searches (3) and (4), then $\lim_{k\to\infty}\|y_k\| = 0$ implies that $\liminf_{k\to\infty}\|g_k\| = 0$.


Proof. Let $\lambda_k^{\max}$ be the maximum eigenvalue of $P_{k+1}^T P_{k+1}$. From Proposition 2.3 and (22), it follows that

$$\lambda_k^{\max} = \max\{1, c^2\}\,\omega_k \le \big(1 + (1-\delta)\sqrt{\omega_k^{-1}}\big)^2\,\omega_k \le 4\omega_k.$$

The Wolfe condition (4) and Remark 2.2 of Proposition 2.1 yield

$$\omega_k = \frac{y_k^T y_k\, s_k^T s_k}{(s_k^T y_k)^2} \le \frac{y_k^T y_k\, s_k^T s_k}{(1-c_2)^2 (s_k^T g_k)^2} = \frac{y_k^T y_k\, g_k^T P_k^T P_k g_k}{(1-c_2)^2 (d_k^T g_k)^2} \le c_3^2\,\frac{\lambda_{k-1}^{\max}\|y_k\|^2}{\|g_k\|^2}, \qquad (33)$$

where $c_3 = \dfrac{1}{(1-c_2)\delta}$. Thus,

$$\lambda_k^{\max} \le \lambda_{k-1}^{\max}\,(2c_3)^2\,\frac{\|y_k\|^2}{\|g_k\|^2} \le \cdots \le \lambda_0^{\max}\prod_{j=1}^{k}(2c_3)^2\,\frac{\|y_j\|^2}{\|g_j\|^2}.$$

Now assume that $\lim_{k\to\infty}\|y_k\| = 0$ and $\liminf_{k\to\infty}\|g_k\| = \epsilon > 0$. Then there exists a positive integer $N_0$ such that $\dfrac{\|y_j\|}{\|g_j\|} \le (2c_3)^{-1}$ for $j > N_0$; thus,

$$\lambda_k^{\max} \le \lambda_0^{\max}\prod_{j=1}^{N_0}(2c_3)^2\,\frac{\|y_j\|^2}{\|g_j\|^2}\;\prod_{j=N_0+1}^{k}(2c_3)^2\,\frac{\|y_j\|^2}{\|g_j\|^2} \le C_{N_0}, \quad \forall k > N_0,$$

where $C_{N_0} = \lambda_0^{\max}\prod_{j=1}^{N_0}(2c_3)^2\,\dfrac{\|y_j\|^2}{\|g_j\|^2}$. Theorem 1.1 then claims that $\lim_{k\to\infty}\|g_k\| = 0$, which contradicts the assumption, so $\lim_{k\to\infty}\|y_k\| = 0$ implies that $\liminf_{k\to\infty}\|g_k\| = 0$. □

2.3. Numerical experiment

In this subsection, we test the algorithms PDCGy (i.e., PDy1) and RSPDCGy (i.e., RSPDy1, RSPDy2 and RSPDy3). We set $\eta = 0.001$ in (15) and (16), and $\delta = 0.9$ in (19) for the RSPDCGy algorithm. The test problems are the 73 unconstrained problems coded by N. Andrei (excluding the 71st), available at http://camo.ici.ro/forum/SCALCG/. These test problems come from the unconstrained problems in the CUTE library [4] and other large-scale optimization problems [2]. Each function is tested 10 times with the number of variables $n = 1000, 2000, \ldots, 10{,}000$, so we have a set of 720 unconstrained optimization test problems. The starting points used are those given in the code evalfg.for. For comparison with the CG_DESCENT algorithm [14], all codes are written in Fortran 77 following the CG_DESCENT code, which can be obtained from Hager's web page at http://www.math.ufl.edu/hager/. The termination criterion of all algorithms is $\|g\|_\infty < 10^{-6}$, where $\|\cdot\|_\infty$ is the infinity norm of a vector. These tests use the approximate Wolfe line search and the default parameters in CG_DESCENT. They are performed on a PC with an Intel(R) Core(TM) 2 Duo E4600 2.40 GHz CPU and 2.00 GB of RAM, using the f77 compiler. The detailed numerical results are available at https://www.researchgate.net/profile/Dongyi_Liu/publications/.

Figs. 1 and 2 present the Dolan and Moré performance profiles [10] of the PDCGy and RSPDCGy algorithms relative to the metric NF + 3NG and the number of iterations (Nite), respectively, where NF is the number of function evaluations and NG is the number of gradient evaluations. The metric NF + 3NG is suggested by Hager and Zhang [14]. The numerical results show that the performance of the RSPDy1, RSPDy2 and RSPDy3 algorithms is better than that of the PDy1 algorithm, and the RSPDy2 algorithm outperforms the others in the performance profile.
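For reference, the Dolan and Moré performance profile [10] used in Figs. 1-4 can be computed from a solver-by-problem cost matrix as in the following generic sketch (our own naming, not the authors' perf.m or Fortran tooling):

```python
import numpy as np

def performance_profile(costs, taus):
    """costs: (n_problems, n_solvers) array of a metric such as NF + 3*NG; mark failures with np.inf.
    Returns, per solver, the fraction of problems with performance ratio r_{p,s} <= tau for each tau."""
    best = np.min(costs, axis=1, keepdims=True)          # best cost achieved on each problem
    ratios = costs / best                                 # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(costs.shape[1])])
```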

[Fig. 1. Performance profiles P(r_{p,s} <= tau) based on the metric NF + 3NG for PDy1, RSPDy1, RSPDy2 and RSPDy3.]


[Fig. 2. Performance profiles P(r_{p,s} <= tau) based on Nite for PDCGy and RSPDCGy.]

[Fig. 3. Performance profiles P(r_{p,s} <= tau) based on the metric NF + 3NG for PDy1, RSPDy2 and CG_DESCENT.]

[Fig. 4. Performance profiles P(r_{p,s} <= tau) based on the number of iterations for PDy1, RSPDy2 and CG_DESCENT.]

In addition, contrasting the PDy1 algorithm and the RSPDy1 algorithm, we find that the PDy1 algorithm achieves the minimum number of iterations in 44 problems, the RSPDy1 algorithm does so in 24 problems, and there are 622 problems on which they take the same number of iterations. These results show that the restriction (15) for $y_k^T s_k$ (or (16) for $y_k^T d_k$) not only assures global convergence for nonconvex functions, but also improves the performance of the PDCGy algorithm to some extent.

Table 1. The contrast of Pminit and Peq for every two algorithms.

Alg             PDy1 vs. RSPDy2    RSPDy2 vs. CG_DE    PDy1 vs. CG_DE
Pminit (Peq)    166, 194 (329)     271, 237 (187)      262, 233 (192)

Table 2. The number of failures (Nfail) for the different algorithms.

Alg      PDy1    RSPDy1    RSPDy2    RSPDy3    CG_DESCENT
Nfail    30      22        19        20        22

In what follows, we compare the performance of the CG_DESCENT algorithm with the RSPDy2 algorithm and the PDy1 algorithm (the original, unrestricted algorithm). Figs. 3 and 4 present the Dolan and Moré performance profiles of the PDy1, RSPDy2 and CG_DESCENT algorithms relative to the metric NF + 3NG and Nite, respectively. The three algorithms simultaneously solve 686 out of the 720 problems successfully; among these, the PDy1 algorithm achieves the minimum number of iterations in 89 problems, the RSPDy2 algorithm in 123 problems and the CG_DESCENT algorithm in 190 problems, and there are 168 problems on which they take the same number of iterations. For each of the remaining 34 problems, at least one of the three algorithms cannot solve it. Table 1 gives the pairwise comparison, where CG_DE abbreviates the CG_DESCENT algorithm, Pminit is the number of problems on which an algorithm attains the minimum number of iterations compared with the other, and Peq (in parentheses) is the number of problems on which the two algorithms take the same number of iterations. For example, the entry 166, 194 (329) in the column PDy1 vs. RSPDy2 means that the PDy1 algorithm achieves the minimum number of iterations in 166 problems, the RSPDy2 algorithm in 194 problems, and that there are 329 problems on which they take the same number of iterations; the two algorithms simultaneously solve 689 problems (the sum of 166, 194 and 329) successfully. Finally, Table 2 lists the number (Nfail) of problems on which each algorithm (Alg) terminates improperly.

All in all, the preliminary numerical results show that the effect of the restriction on the PDCGy algorithm is positive: the RSPDCGy algorithm (RSPDy2) outperforms the PDCGy algorithm (PDy1) and is competitive with the CG_DESCENT algorithm.

3. Conclusions and remarks

If we denote the angle between $y_k$ and $s_k$ by $\phi_k$, then it follows from the restriction (15) and (32) that $\|y_k\|\cos(\phi_k) \ge \eta_0$ under the condition (4), where $\eta_0$ is a positive constant, which means that the projection of $y_k$ on $s_k$ is positive. When the condition (4) does not hold, we can restrict $y_k^T s_k$ as follows:

$$\eta_k^s = y_k^T s_k, \ \text{ as } y_k^T s_k \ge \eta\|s_k\|^2, \qquad \text{and} \qquad \eta_k^s = \|s_k\|^2 \ \text{ otherwise}.$$

Remark 2.4 shows that the maximum eigenvalue of $P_{k+1}^T P_{k+1}$ is

$$\lambda_k^{\max} = \max\{1, c^2\}\,\omega_k = \max\{1, c^2\}\cos^{-2}(\phi_k).$$

So, the angle $\phi_k$ is also an important factor influencing conjugate gradient algorithms. In addition, the restriction on $y_k^T s_k$ provides a new technique for constructing more effective conjugate gradient algorithms. The spectral method presented in this paper can also be applied to quasi-Newton methods, in which case the iteration matrix is symmetric positive definite. The method may even be applied to other optimization problems. Therefore, it is worth studying further.

Acknowledgements

The authors thank N. Andrei for the Fortran code evalfg.for, W.W. Hager and H. Zhang for the CG_DESCENT code, and J.J. Moré for the Matlab code perf.m.

References

[1] M. Al-Baali, Descent property and global convergence of the Fletcher–Reeves method with inexact line-search, IMA J. Numer. Anal. 5 (1985) 121–124.
[2] N. Andrei, An unconstrained optimization test functions collection, Adv. Model. Optim. 10 (2008) 147–161.
[3] N. Andrei, A modified Polak–Ribiere–Polyak conjugate gradient algorithm for unconstrained optimization, Optimization (2010), http://dx.doi.org/10.1080/02331931003653187.
[4] I. Bongartz, A.R. Conn, N.I.M. Gould, P.L. Toint, CUTE: constrained and unconstrained testing environments, ACM Trans. Math. Softw. 21 (1995) 123–160.
[5] Y.H. Dai, Y.X. Yuan, A nonlinear conjugate gradient with a strong global convergence property, SIAM J. Optim. 10 (1999) 177–182.
[6] Y.H. Dai, J.Y. Han, G.H. Liu, D.F. Sun, H.X. Yin, Y. Yuan, Convergence properties of nonlinear conjugate gradient methods, SIAM J. Optim. 10 (1999) 348–358.


[7] Y.H. Dai, L.Z. Liao, New conjugacy conditions and related nonlinear conjugate gradient methods, Appl. Math. Optim. 43 (2001) 87–101.
[8] Y.H. Dai, Conjugate gradient methods with Armijo-type line searches, Acta Math. Appl. Sin. English Ser. 18 (2002) 123–130.
[9] Z.F. Dai, B.S. Tian, Global convergence of some modified PRP nonlinear conjugate gradient methods, Optim. Lett. 5 (4) (2011) 615–630, http://dx.doi.org/10.1007/s11590-010-0224-8.
[10] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Math. Prog. Ser. A 91 (2002) 201–213.
[11] J.C. Gilbert, J. Nocedal, Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optim. 2 (1992) 21–42.
[12] W.W. Hager, H. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM J. Optim. 16 (2005) 170–192.
[13] W.W. Hager, H. Zhang, A survey of nonlinear conjugate gradient methods, Pac. J. Optim. 2 (2006) 35–58.
[14] W.W. Hager, H. Zhang, Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent, ACM Trans. Math. Softw. 32 (1) (2006) 113–137.
[15] D.Y. Liu, G.Q. Xu, Applying Powell's symmetrical technique to conjugate gradient methods, Comput. Optim. Appl. 49 (2) (2011) 319–334, http://dx.doi.org/10.1007/s10589-009-9302-1.
[16] D.Y. Liu, Y.F. Shang, A new Perry conjugate gradient method with the generalized conjugacy condition, in: International Conference on Computational Intelligence and Software Engineering (CiSE 2010), Dec. 10–12, 2010.
[17] A. Perry, A modified conjugate gradient algorithm, Oper. Res., Technical Notes, 1978, pp. 1073–1078.
[18] Z. Wei, G. Li, L. Qi, New nonlinear conjugate gradient formulas for large-scale unconstrained optimization problems, Appl. Math. Comput. 179 (2006) 407–430.
[19] Z. Wei, S. Yao, L. Liu, The convergence properties of some new conjugate gradient methods, Appl. Math. Comput. 183 (2006) 1341–1350.
[20] P. Wolfe, Convergence conditions for ascent methods, SIAM Rev. 11 (1969) 226–235.
[21] P. Wolfe, Convergence conditions for ascent methods. II: some corrections, SIAM Rev. 13 (1971) 185–188.
[22] H. Yabe, M. Takano, Global convergence properties of nonlinear conjugate gradient methods with modified secant condition, Comput. Optim. Appl. 28 (2004) 203–225.
[23] G. Yu, Y. Zhao, Z. Wei, A descent nonlinear conjugate gradient method for large-scale unconstrained optimization, Appl. Math. Comput. 187 (2007) 636–643.
[24] L. Zhang, W.J. Zhou, D.H. Li, Global convergence of a modified Fletcher–Reeves conjugate gradient method with Armijo-type line search, Numer. Math. 104 (2006) 561–572.
[25] G. Zoutendijk, Nonlinear programming, computational methods, in: J. Abadie (Ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam, 1970, pp. 37–86.