A Retrospective Trust-Region Method for Unconstrained Optimization

F. Bastin¹, V. Malmedy²,³, M. Mouffe⁴, D. Tomanos²,⁵, Ph. Toint²

¹ Computing Science and Operational Research Department, University of Montreal
² Department of Mathematics, FUNDP
³ F.R.S.-FNRS, Research Fellow
⁴ CERFACS, Toulouse
⁵ FRIA, Research Fellow ([email protected])

AMA Workshop, Hong Kong, June 2008

Ph. Toint (FUNDP), Retrospective Trust-Region, June 2008


Introduction

The problem

Unconstrained optimization:

    min f(x),  x ∈ ℝⁿ

with objective function f : ℝⁿ → ℝ
- nonlinear, twice-continuously differentiable, and bounded below
- no convexity assumption


Introduction

Basic Trust-Region method (BTR)

Until convergence:
1. choose a local model mk of the objective f around xk
2. compute a trial point xk + sk that decreases the model mk within a trust region ‖sk‖ ≤ ∆k
3. compute the reduction ratio

       ρk := (f(xk) − f(xk + sk)) / (mk(xk) − mk(xk + sk))

4. if mk and f agree at xk + sk, i.e. ρk ≥ η1, then
       accept the trial point: xk+1 = xk + sk
       update the trust-region radius:
           ∆k+1 ∈ [∆k, ∞)        if ρk ≥ η2
           ∆k+1 ∈ [γ2∆k, ∆k)     if ρk ∈ [η1, η2)
   else
       reject the trial point: xk+1 = xk
       reduce the trust-region radius: ∆k+1 ∈ [γ1∆k, γ2∆k)

with 0 < η1 ≤ η2 < 1 and 0 < γ1 ≤ γ2 < 1

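The BTR loop above can be sketched in a few lines. This is a minimal illustration, not the authors' Matlab code: the Cauchy point (exact minimizer of the model along −gk, clipped to the region) stands in for the generic "decrease the model" step, and the function names, parameter values (η1 = 0.01, η2 = 0.9, γ1 = 0.25, γ2 = 0.5, doubling on very successful steps) and the quadratic test problem are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def btr(f, grad, hess_vec, x, delta=1.0, tol=1e-5, max_iter=500,
        eta1=0.01, eta2=0.9, gamma1=0.25, gamma2=0.5):
    """Basic trust-region loop, with the Cauchy step as model minimizer."""
    for k in range(max_iter):
        g = grad(x)
        gnorm = norm(g)
        if gnorm <= tol:
            return x, k
        # Cauchy step: minimize m(x - t*g) = f - t*||g||^2 + 0.5*t^2*g'Hg
        # subject to t*||g|| <= delta
        Hg = hess_vec(x, g)
        gHg = dot(g, Hg)
        t_max = delta / gnorm
        t = min(dot(g, g) / gHg, t_max) if gHg > 0 else t_max
        s = [-t * gi for gi in g]
        predicted = t * dot(g, g) - 0.5 * t * t * gHg  # m(x) - m(x+s) > 0
        x_trial = [xi + si for xi, si in zip(x, s)]
        rho = (f(x) - f(x_trial)) / predicted          # reduction ratio
        if rho >= eta1:
            x = x_trial                                # accept trial point
            delta = 2.0 * delta if rho >= eta2 else gamma2 * delta
        else:
            delta = gamma1 * delta                     # reject and shrink
    return x, max_iter

# usage on an illustrative ill-conditioned quadratic
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad_f = lambda x: [2.0 * x[0], 20.0 * x[1]]
hess_vec = lambda x, v: [2.0 * v[0], 20.0 * v[1]]
x_opt, iters = btr(f, grad_f, hess_vec, [1.0, 1.0])
```

Since the model here is the exact quadratic, ρk ≈ 1 at every iteration, so every step is very successful and the radius quickly becomes inactive.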

Introduction

Roles of the reduction ratio — main idea of the new method

In BTR, the reduction ratio ρk plays two roles:
1. acceptance of the trial point xk + sk
2. control of the trust-region radius update

Idea: distinguish these two roles, since:
1. the acceptance step is based on how well the current model mk predicts the decrease of the function f at xk + sk
2. the updated radius is used to define where the new model mk+1 is trusted to agree with the function f around xk + sk


Retrospective Trust-Region method

Retrospective Trust-Region method (RTR)

Until convergence:
1. choose a local model mk of the objective f around xk
2. if the former trial point was rejected then
       reduce the trust-region radius: ∆k ∈ [γ1∆k−1, γ2∆k−1)
   else compute the retrospective ratio

       ρ̃k := (f(xk−1) − f(xk)) / (mk(xk−1) − mk(xk))

   and update the trust-region radius:
       ∆k ∈ [∆k−1, ∞)           if ρ̃k ≥ η̃2
       ∆k ∈ [γ2∆k−1, ∆k−1)      if ρ̃k ∈ [η̃1, η̃2)
3. compute a trial point xk + sk decreasing the model mk within ‖sk‖ ≤ ∆k
4. compute the reduction ratio

       ρk := (f(xk) − f(xk + sk)) / (mk(xk) − mk(xk + sk))

5. if ρk ≥ η1, accept the trial point: xk+1 = xk + sk; otherwise reject it: xk+1 = xk

with 0 < η1 < 1, 0 < η̃1 ≤ η̃2 < 1 and 0 < γ1 ≤ γ2 < 1

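The distinctive part of RTR is step 2: the radius for iteration k is set from the *retrospective* ratio ρ̃k, which measures how well the new model mk explains the decrease already obtained between xk−1 and xk. A minimal sketch of that update, with illustrative parameter values and concrete grow/shrink factors; the slide leaves the case ρ̃k < η̃1 implicit, so shrinking there is an assumption:

```python
def rtr_radius_update(delta_prev, prev_rejected, f_prev, f_cur,
                      mk_at_prev, mk_at_cur,
                      eta1t=0.05, eta2t=0.9,
                      gamma1=0.25, gamma2=0.5, grow=2.0):
    """Step 2 of RTR: return (delta_k, rho_tilde_k or None).

    mk_at_prev / mk_at_cur are the NEW model m_k evaluated at x_{k-1}
    and x_k; the concrete factors grow, gamma1, gamma2 are assumptions
    picked inside the intervals given on the slide."""
    if prev_rejected:
        # former trial point rejected: delta_k in [gamma1*d, gamma2*d)
        return gamma1 * delta_prev, None
    rho_t = (f_prev - f_cur) / (mk_at_prev - mk_at_cur)
    if rho_t >= eta2t:
        return grow * delta_prev, rho_t    # delta_k in [d, inf)
    if rho_t >= eta1t:
        return gamma2 * delta_prev, rho_t  # delta_k in [gamma2*d, d)
    # rho_tilde below eta1t: implicit on the slide; shrink (assumption)
    return gamma1 * delta_prev, rho_t

# usage: the new model explains the past decrease well -> radius grows
delta_k, rho_t = rtr_radius_update(1.0, False, 10.0, 9.0, 10.0, 9.05)
# usage: the former trial point was rejected -> radius shrinks
delta_rej, rho_rej = rtr_radius_update(1.0, True, 0.0, 0.0, 0.0, 0.0)
```

Note that ρ̃k reuses quantities already available (f(xk−1), f(xk), and two evaluations of mk), so the retrospective test adds no extra function or gradient evaluations.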

Retrospective Trust-Region method

Graphically... (1)

[Figure: problem HAIRY, iteration 38. One-dimensional section along the step s showing f, the models mk and mk+1, the points xk and xk+1, and the radii ∆k and ∆k+1.]


Retrospective Trust-Region method

Graphically... (2)

[Figure: problem HAIRY, iteration 25. One-dimensional section along the step s showing f, the models mk and mk+1, the points xk and xk+1, and the radii ∆k and ∆k+1.]


Convergence theory

RTR is no longer covered by the classical theory ⇒ an adapted convergence theory is needed.

Assume:
- ∇xx f and ∇xx mk uniformly bounded
- first-order coherent models: ∇x f(xk) = ∇x mk(xk)
- sufficient decrease condition (at least a fraction of the Cauchy-point decrease):

      mk(xk) − mk(xk + sk) ≥ γ ‖gk‖ min(‖gk‖/βk, ∆k)

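The sufficient-decrease condition above is satisfied by the Cauchy point itself, classically with γ = 1/2 and any βk ≥ ‖∇xx mk‖. A small numeric check on an illustrative 2-D quadratic model (the particular g, H, ∆, and β below are assumptions chosen for the example):

```python
import math

# Illustrative 2-D quadratic model: m(x_k + s) = f_k + g's + 0.5*s'Hs
g = [1.0, 1.0]
H = [[2.0, 0.0], [0.0, 20.0]]
delta = 0.5
beta = 20.0          # any bound with beta >= ||H||; here ||H||_2 = 20

gnorm = math.hypot(g[0], g[1])
gHg = sum(g[i] * H[i][j] * g[j] for i in range(2) for j in range(2))

# Cauchy step: exact minimizer of the model along -g, clipped to the region
t = min(gnorm ** 2 / gHg, delta / gnorm) if gHg > 0 else delta / gnorm
decrease = t * gnorm ** 2 - 0.5 * t * t * gHg    # m(x_k) - m(x_k + s_k)

bound = 0.5 * gnorm * min(gnorm / beta, delta)   # gamma = 1/2
assert decrease >= bound > 0.0
```

Any step achieving at least a fixed fraction of this Cauchy decrease (e.g. a dogleg or an exact subproblem solution) therefore satisfies the assumption as well.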

Convergence theory

First-order convergence

Where do the changes occur?
- Let δk m := m(xk) − m(xk+1) be the reduction of the model m at iteration k. Then |δk mk − δk mk+1| ≤ κ ∆k².
- If gk ≠ 0 and ∆k ≤ ζ ‖gk‖, then iteration k is successful and ∆k grows.

Finally, the same results as for BTR:
- If there are only finitely many successful iterations, then after some iteration xk = x∗, which is first-order critical.
- lim k→∞ ‖∇x f(xk)‖ = 0.


Convergence theory

Second-order convergence (1)

Assume moreover:
- asymptotically second-order coherent models near first-order critical points: ‖∇xx f(xk) − ∇xx mk(xk)‖ → 0 when ‖gk‖ → 0

Where do the changes occur?
- Suppose that mki(xki) − mki(xki + ski) ≥ ν ‖ski‖² and that ski → 0. Then iteration ki is successful and ∆ki grows.


Convergence theory

Second-order convergence (2)

Assume furthermore:
- ∇xx mk Lipschitz continuous
- if τk := λmin(∇xx mk) < 0, then mk(xk) − mk(xk + sk) ≥ ξ |τk| min(τk², ∆k²)

Finally, the same results:
- Suppose that {xk} remains in a compact set. Then there exists at least one limit point x∗ that is second-order critical.
- Suppose that ∆k+1 ∈ [γ3∆k, γ4∆k] whenever ρ̃k ≥ η̃2 (with γ4 ≥ γ3 > 1). Then every limit point x∗ is second-order critical.

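The negative-curvature condition above is classically achieved (with ξ = 1/2) by the "eigen-step": move to the trust-region boundary along an eigenvector of the most negative eigenvalue, with the sign chosen so the linear term does not increase the model. A small numeric check on an illustrative 2-D model; the particular g, H, and ∆ are assumptions for the example:

```python
# Illustrative 2-D model with negative curvature: m(s) = g's + 0.5*s'Hs
g = [0.2, 0.1]
H = [[-1.0, 0.0], [0.0, 3.0]]
delta = 0.5

tau = -1.0                     # lambda_min(H); read off the diagonal here
u = [1.0, 0.0]                 # corresponding unit eigenvector
if g[0] * u[0] + g[1] * u[1] > 0:
    u = [-u[0], -u[1]]         # flip the sign so that g'u <= 0
s = [delta * ui for ui in u]   # eigen-step to the trust-region boundary

m_decrease = -(sum(gi * si for gi, si in zip(g, s))
               + 0.5 * sum(s[i] * H[i][j] * s[j]
                           for i in range(2) for j in range(2)))
bound = 0.5 * abs(tau) * min(tau * tau, delta * delta)   # xi = 1/2
assert m_decrease >= bound > 0.0
```

Along this step the quadratic term alone contributes −½∆²τk = ½|τk|∆² of decrease, which dominates ξ|τk| min(τk², ∆k²) whenever ξ ≤ 1/2.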

Numerical experiments

- 146 unconstrained problems from the CUTEr library (Gould, Orban, Toint, 2003), with sizes between 2 and 500
- Matlab implementation
- classical trust-region parameters, as advised by Conn, Gould, Toint (2000)
- exact quadratic model; subproblem solved with the Moré-Sorensen method
- stopping criterion: ‖gk‖ ≤ 10⁻⁵, or failure after more than 10⁵ iterations

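The Moré-Sorensen method solves the subproblem exactly by finding λ ≥ max(0, −λmin(H)) such that ‖s(λ)‖ = ∆, where s(λ) = −(H + λI)⁻¹ g. The sketch below shows only that core idea, much simplified: it assumes a diagonal Hessian, replaces the method's safeguarded Newton iteration on the secular equation (built on Cholesky factorizations) with plain bisection, and does not handle the "hard case":

```python
import math

def exact_tr_diag(g, h, delta, iters=200):
    """Solve min g's + 0.5*s'diag(h)s  s.t. ||s|| <= delta, for a
    diagonal Hessian with entries h.  Simplified sketch of the
    More-Sorensen idea; the 'hard case' is not handled."""
    def s_of(lam):
        return [-gi / (hi + lam) for gi, hi in zip(g, h)]
    def nrm(s):
        return math.sqrt(sum(si * si for si in s))
    # interior Newton step if H is positive definite and the step fits
    if min(h) > 0:
        s = s_of(0.0)
        if nrm(s) <= delta:
            return s
    # boundary case: ||s(lam)|| is decreasing in lam on (lo, infinity)
    lo = max(0.0, -min(h))        # ||s(lam)|| blows up as lam -> lo+
    hi_ = lo + nrm(g) / delta     # guarantees ||s(hi_)|| <= delta
    for _ in range(iters):        # bisection on the secular equation
        mid = 0.5 * (lo + hi_)
        if nrm(s_of(mid)) > delta:
            lo = mid
        else:
            hi_ = mid
    return s_of(0.5 * (lo + hi_))

# usage: a binding radius gives a boundary solution, a large one the
# unconstrained Newton step
s_bdry = exact_tr_diag([1.0, 1.0], [2.0, 20.0], 0.25)
s_newton = exact_tr_diag([1.0, 1.0], [2.0, 20.0], 10.0)
```

For a general symmetric Hessian the same picture holds in the eigenbasis of H, which is why the diagonal case captures the essence of the secular equation.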

Numerical experiments

Performance profile

[Figure: performance profile comparing the retrospective and the basic TR algorithms; performance ratio from 1 to 4 on the horizontal axis, fraction of problems solved from 0.5 to 1 on the vertical axis, with one curve for Retrospective TR and one for Basic TR.]

Conclusions and perspectives

Conclusions
- exploitation of the most recent model information
- first- and second-order convergence theory
- improved numerical performance
- no supplementary cost

Perspectives
- stochastic programming (dynamic accuracy in the objective function computation)
- combination with ACO methods?

Thank you for your attention!
