Nonlinearly constrained optimization via sequential regularized linear programming by Mitch Crowe
B.Sc., The University of British Columbia, 2007
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)
The University of British Columbia (Vancouver)
August 2010
© Mitch Crowe 2010
Abstract

This thesis proposes a new active-set method for large-scale nonlinearly constrained optimization. The method solves a sequence of linear programs to generate search directions. The typical approach for globalization is based on damping the search directions with a trust-region constraint; our proposed approach is instead based on using a 2-norm regularization term in the objective. Numerical evidence is presented which demonstrates scaling inefficiencies in current sequential linear programming algorithms that use a trust-region constraint. Specifically, we show that the trust-region constraints in the trust-region subproblems significantly reduce the warm-start efficiency of these subproblem solves, and also unnecessarily induce infeasible subproblems. We also show that the use of a regularized linear programming (RLP) step largely eliminates these inefficiencies and, additionally, that the dual problem to RLP is a bound-constrained least-squares problem, which may allow for very efficient subproblem solves using gradient-projection-type algorithms. Two new algorithms were implemented and are presented in this thesis, based on solving sequences of RLPs and trust-region constrained LPs. These algorithms are used to demonstrate the effectiveness of each type of subproblem, which we extrapolate onto the effectiveness of an RLP-based algorithm with the addition of Newton-like steps. All of the source code needed to reproduce the figures and tables presented in this thesis is available online at http://www.cs.ubc.ca/labs/scl/thesis/10crowe/
Table of Contents

Abstract ......................................................... ii
Table of Contents ................................................ iii
List of Tables ................................................... v
List of Figures .................................................. vi
Acknowledgments .................................................. vii
1 Preliminaries .................................................. 1
  1.1 Introduction ............................................... 1
  1.2 Notation and definitions ................................... 2
2 Current approaches for solving nonlinear programs .............. 5
  2.1 Interior-point methods ..................................... 5
  2.2 Active-set methods ......................................... 6
  2.3 Sequential quadratic programming ........................... 7
3 Sequential linear–quadratic programming ........................ 9
  3.1 Local algorithm ............................................ 9
  3.2 Globalization .............................................. 10
  3.3 Infeasible subproblems ..................................... 13
4 Sequential regularized linear programming ...................... 16
  4.1 Algorithm outline .......................................... 16
  4.2 Infeasible subproblems ..................................... 17
  4.3 Step acceptance ............................................ 17
  4.4 Regularization parameter updates ........................... 19
  4.5 Lagrange multiplier estimates .............................. 21
  4.6 Algorithm statement ........................................ 22
5 Experiments .................................................... 25
  5.1 A benchmark trust-region algorithm ......................... 25
  5.2 Testing environment ........................................ 25
  5.3 Implementation ............................................. 27
  5.4 Results .................................................... 30
6 Discussion ..................................................... 31
  6.1 Expensive infeasible subproblems ........................... 31
  6.2 Slow active-set updates .................................... 32
  6.3 Inefficient warm-starts .................................... 33
7 Conclusions .................................................... 38
Bibliography ..................................................... 40
Appendices
A Derivation of the active sets for TrLP subproblems ............. 42
B SRLP Results ................................................... 43
C STrLP Results .................................................. 48
List of Tables

4.1 First-order KKT conditions for NLP and RLP ................... 22
5.1 Relevant SQOPT parameters used for solving RLP ............... 28
5.2 Relevant SNOPT parameters used for solving MINFR ............. 30
B.1 Definition of symbols used in Table B.2 ...................... 43
B.2 SRLP Results ................................................. 47
C.1 Definition of symbols used in Table C.2 ...................... 48
C.2 STrLP Results ................................................ 52
List of Figures

3.1 Infeasibility induced by a trust-region ...................... 13
6.1 Infeasible subproblems for SRLP and STrLP .................... 32
6.2 Warm-start quality of STrLP on HIMMELBI ...................... 34
6.3 Warm-start quality of SRLP on HIMMELBI ....................... 35
6.4 Warm-start quality cumulative distribution function .......... 36
6.5 Minor iterations per iteration on HIMMELBI ................... 37
6.6 Minor iterations per iteration cumulative distribution function 37
Acknowledgments

I owe a huge debt of gratitude to my supervisor Michael Friedlander. This work would not have been possible without his enthusiastic support and encouragement. Michael allowed me the freedom to explore and to truly enjoy this research, while at the same time managing to keep me on track and focused. I was very fortunate to have him as my supervisor. A special thanks goes to my labmate Ewout van den Berg for his tireless help and for being a great friend. Many thanks go also to Chen Greif, both for his helpful suggestions with regard to this work and for his very kind support throughout my degree.
Mitch Crowe
Vancouver, August 2010
Chapter 1
Preliminaries

1.1 Introduction
Nonlinear programs (NLP), which can be written generically as

    NLP    minimize_{x ∈ Rn}   f(x)
           subject to          c(x) = 0                              (1.1)
                               x ≥ 0,
have a broad range of applications in engineering, science, and economics. In this thesis, we assume that f : Rn → R and c : Rn → Rm are twice-continuously differentiable functions. A huge body of work on solving NLP has been produced in the last 50 years, and a vast and diverse array of algorithms has been developed to solve problems of this form. Even so, one can easily choose an f or a c so that none of the current methods will suffice. In truth, NLP describes such a broad class of optimization problems that the best we can hope to do is to expand the small domain of those we can solve.

As computing power grows, real-world applications are beginning to demand the solution of very large-scale NLPs—possibly including millions of variables and constraints. Unfortunately, this is a domain in which current algorithms have proven to be insufficient. The focus of this thesis is to explore an algorithm which scales gracefully on such problems.

The sequential quadratic programming (SQP) method is a staple for solving NLP [1]. Here, an algorithm successively makes quadratic models of NLP and solves an inequality-constrained quadratic program (IQP) to generate a new iterate. The trouble is that solving a quadratic program can become very expensive as problem dimensions grow and, as such, SQP methods can be inefficient at solving large-scale NLPs. More recently, a variant of SQP was proposed in order to help it scale to larger problems [7, 3, 5]. This is known as sequential linear–quadratic programming (SL–QP). The idea is rather simple: instead of solving an expensive IQP at each iteration, we solve a cheaper linear program (LP), followed by an equality-constrained quadratic program (EQP). Conclusive evidence of the effectiveness of currently implemented SL–QP algorithms has not yet been produced; however, it is our belief that this is due to several sources of inefficiency in these algorithms which become bottlenecks as problem dimensions grow. This thesis will discuss these inefficiencies, and
propose a significant modification to the basic SL–QP method which is meant to alleviate them. Specifically, we propose a different means of controlling the step-length generated by the LP subproblem at each iteration: current methods use a trust-region constraint, whereas we use regularization.

In the next two chapters, we discuss the current landscape of algorithms for solving NLP. Chapter 4 then presents a new algorithm, which we call sequential regularized linear programming (SRLP). This is not an SL–QP algorithm, because it does not solve an EQP subproblem at each iteration. However, since convergence guarantees of current SL–QP algorithms rely solely on the LP step (the EQP step is added simply to improve convergence rates [3, 5]), this algorithm will provide a suitable means of testing the effectiveness of regularized LP subproblems versus those with a trust-region constraint added. Chapter 5 describes the implementation details of the SRLP algorithm, together with our testing environment. Finally, Chapter 6 discusses the advantages of solving RLP versus TrLP subproblems. We demonstrate several sources of inefficiency in current SL–QP algorithms, and show how an RLP-based algorithm would fare better.
1.2 Notation and definitions
This section presents a brief review of some of the optimization background required for a proper reading of this thesis. Along the way, we will encounter some important definitions and notation that are used throughout this text. This review is not meant to be comprehensive, and the curious reader is directed to the excellent book by Nocedal and Wright [13] for a more detailed background to this material.
1.2.1 Optimization basics
The field of optimization can be roughly summarized as the pursuit of finding a vector x ∈ Rn which minimizes a function f : Rn → R, where x is limited to lie within a constraint set, Ω. We can write this as the generic optimization problem

    P    minimize_x   f(x)
         subject to   x ∈ Ω.

Here, f(x) is the objective, and Ω the feasible set. If Ω ≡ ∅ (i.e., if there is no x which satisfies the condition x ∈ Ω) we say that P is infeasible, or inconsistent. Otherwise it is feasible, or consistent.
1.2.2 Nonlinear programming
Problem NLP (1.1) is a special case of P. Here c : Rn → Rm is referred to as the constraint function, the condition c(x) = 0 as the constraints, and the condition x ≥ 0 as the variable bounds. The feasible set for NLP, then, is
the set of points which satisfy the constraints and the variable bounds, i.e., Ω = {x : c(x) = 0 and x ≥ 0}. Formulations with more generic constraints (such as inequality constraints, c(x) ≥ 0) can easily be re-written in this form, and so we will be able to consider it without loss of generality.

The Lagrangian is an important concept for nonlinear programming.

Definition 1.1. The Lagrangian of NLP is

    L(x, y, z) := f(x) − yᵀc(x) − zᵀx,

where y ∈ Rm and z ∈ Rn; its gradient with respect to x is given by

    gL(x, y, z) := ∇x L(x, y, z) = ∇f(x) − ∇c(x)ᵀy − z;

and its Hessian with respect to x is given by

    HL(x, y, z) := ∇²x L(x, y, z) = ∇²f(x) − Σᵢ₌₁ᵐ yᵢ ∇²cᵢ(x).
The first-order KKT (Karush–Kuhn–Tucker) conditions must hold at an optimal point of NLP.

Definition 1.2. The point x satisfies the first-order KKT conditions (and is called a first-order KKT point) of NLP if and only if there exist vectors y ∈ Rm and z ∈ Rn such that

    gL(x, y, z) = 0
    c(x) = 0
    x ≥ 0
    z ≥ 0
    x ◦ z = 0,
where (x ◦ z)i := xi zi. Under certain conditions on the constraints (the linear independence constraint qualification, for instance) it can be shown that any local solution, x⋆, to NLP must be a first-order KKT point; see [13, Theorem 12.1]. The converse is not necessarily guaranteed, however. This requires the satisfaction of stronger conditions, collectively known as the second-order sufficient conditions, which are outside the scope of our current discussion, or convexity (see [13, Theorem 12.6]). This thesis focuses on active-set methods, which require the following definition:
1.2. Notation and definitions
4
Definition 1.3. The active set, A(x), for NLP at a given point x, is the set of variables which are at their bounds, i.e.,

    A(x) := {i : xi = 0}.

The active set at the solution to NLP—which we refer to as the optimal active set—is of special importance to active-set methods. We use the shorthand notation A⋆ := A(x⋆).
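To make these conditions concrete, here is a minimal numerical sketch (ours, not the thesis's) of evaluating the KKT residuals of Definition 1.2 and the active set of Definition 1.3. The callables grad_f, jac_c, and c, and the tolerance tol, are assumptions supplied by the user.

    import numpy as np

    def kkt_residuals(x, y, z, grad_f, jac_c, c):
        """Residuals of the first-order KKT conditions of Definition 1.2."""
        gL = grad_f(x) - jac_c(x).T @ y - z              # gradient of the Lagrangian
        return {
            "stationarity":    np.linalg.norm(gL, np.inf),    # gL(x, y, z) = 0
            "primal":          np.linalg.norm(c(x), np.inf),  # c(x) = 0
            "bounds":          max(0.0, -x.min()),            # x >= 0
            "dual":            max(0.0, -z.min()),            # z >= 0
            "complementarity": np.abs(x * z).max(),           # x o z = 0
        }

    def active_set(x, tol=1e-8):
        """Active set A(x) of Definition 1.3: variables at their (zero) bounds."""
        return np.flatnonzero(np.abs(x) <= tol)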
Chapter 2
Current approaches for solving nonlinear programs

There have been many different methods proposed for solving NLP (1.1). The methods that we consider can broadly be divided into two main classes: interior-point and active-set methods.
2.1 Interior-point methods
The variable bounds x ≥ 0 can add a significant cost to solving NLP: the number of ways xi can be chosen active or inactive grows combinatorially with the number of variables in the problem. Interior-point methods reformulate these inequality constraints, penalizing their violation by adding a barrier function to the objective. Many different barrier functions could be employed; however, a common choice is the log-barrier. The most basic log-barrier interior-point method repeatedly solves problems of the form

    LBNLP(µ)    minimize_x   Ψ(x) := f(x) − µ Σᵢ₌₁ⁿ ln(xᵢ)
                subject to   c(x) = 0,

where µ > 0 is a sequence of barrier parameters such that µ → 0. The objective Ψ has the property that Ψ(x) = ∞ whenever x ≤ 0, which forces each iterate to lie strictly in the interior of the feasible set (hence the term 'interior point'). As µ → 0 the barrier weakens, and it can be shown under mild assumptions that the solution to LBNLP(µ) approaches the solution of NLP [13, Theorem 17.3]. Algorithm 2.1 shows a basic log-barrier interior-point method. These methods were first introduced in the context of nonlinear programming by Fiacco and McCormick in 1964 [6], but were subsequently abandoned in favour of active-set methods such as sequential quadratic programming (see Section 2.3). More recently, however, they have enjoyed a strong resurgence, and have been shown to be quite effective on medium- to large-scale problems (see [18], for example).
Algorithm 2.1: Basic log-barrier interior-point method
input: x, µ > 0
1  while x is not optimal for NLP do
2      x ← solve LBNLP(µ)
3      µ ← reduce barrier parameter
4  end
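As an illustration of Algorithm 2.1, the following hedged Python sketch runs the basic log-barrier loop, using SciPy's general-purpose SLSQP solver as a stand-in for the inner solve of LBNLP(µ). The shrink factor, the small positivity floor in the bounds (which keeps the logarithm defined), and the stopping rule µ ≤ tol are illustrative assumptions, not details from the thesis.

    import numpy as np
    from scipy.optimize import minimize

    def log_barrier(f, c, x0, mu=1.0, shrink=0.1, tol=1e-6):
        """Basic log-barrier loop in the spirit of Algorithm 2.1 (a sketch)."""
        x = np.asarray(x0, dtype=float)
        while mu > tol:
            # Psi(x) = f(x) - mu * sum(ln x_i); the bounds keep x strictly
            # positive so that the logarithm stays defined inside SLSQP.
            psi = lambda v: f(v) - mu * np.sum(np.log(v))
            res = minimize(psi, x, method="SLSQP",
                           bounds=[(1e-12, None)] * x.size,
                           constraints=[{"type": "eq", "fun": c}])
            x = res.x
            mu *= shrink                 # weaken the barrier
        return x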
Algorithm 2.2: Generic active-set method
input: A
1  while A ≠ A⋆ (the optimal active set for NLP) do
2      A ← generate new active set prediction
3  end
2.2 Active-set methods
If A⋆—an optimal active set for NLP—were known a priori, one could simply determine an optimal point for NLP by fixing the variables in this set at their bounds, ignoring other bound constraints, and solving a much cheaper equality-constrained problem. In this sense, much of the cost of solving NLP can be said to be the determination of an optimal active set. Where interior-point methods skirt this issue by removing the bound constraints, active-set methods seek the optimal active set directly by successively updating a sequence of predictions Ak. Although interior-point methods can be very robust and efficient, one clear advantage of active-set methods is in warm-starting.
2.2.1 Warm-starting
When information received from solving one problem is used to speed the convergence of a second, related, problem, we say that we have warm-started the second problem. Active-set methods are readily warm-started. Two problems which are only slightly perturbed from each other are likely to have very similar optimal active sets, and we can use the optimal active set from one as an initial prediction for the optimal active set of the second. In contrast, interior-point methods are very difficult to warm-start effectively. (Strong efforts have been made recently to improve interior-point warm-starting, but success is still limited compared to that enjoyed by active-set methods [11].) There is a vast array of applications which require the solution to many similar NLPs. Consider, for instance, the branch-and-bound method for mixed integer nonlinear programming, where a related NLP is solved at every major iteration [2]. In such applications, effective warm-starting can be crucial to
Algorithm 2.3: Sequential quadratic programming
input: x, A
1  while x is not optimal for NLP do
2      d, A ← solve IQP(x)[A]
3      x ← x + d
4  end
the success of an algorithm: potentially, the cost of solving a series of related problems can be much less than that of solving them individually. As such, because they are readily warm-started, and in spite of the significant successes of interior-point methods, we believe the pursuit of active-set methods for large-scale NLPs to be a very important topic.

The algorithms in the remainder of this thesis, all of them active-set methods, make heavy use of warm-starting. Each of them successively solves a series of related subproblems. The optimal active set from the most recently solved subproblem, A, is used as an initial prediction of the optimal active set for the following subproblem. The efficiency of these warm-starts plays a crucial role in the effectiveness of the algorithm in question, as we discuss in Chapter 6. As a reminder that we are using an active-set prediction for warm-starting our subproblems, algorithms in this thesis write this explicitly. When an algorithm states, for instance, "solve IQP(x)[A]", it is understood that the subproblem IQP(x) is solved using A as an initial active-set prediction.

(An important technical detail must be addressed here. The set of active variables defined by some A, together with a set of equality constraints, c(x) = 0, may define an inconsistent system. In this case we say that the active set is degenerate. Practical algorithms choose a non-degenerate subset of A—a 'working set' [3, 19, 10]. Unless otherwise stated, we assume throughout this thesis that the active sets generated by the algorithms in question are non-degenerate.)

Many successful active-set algorithms exist. Perhaps the most well-known is the simplex method for linear programming [13, Chapter 13]. For nonlinear programming, however, sequential quadratic programming (SQP) [1] has proven to be one of the most effective active-set methods, as demonstrated by the success of algorithms such as SNOPT [10].
2.3 Sequential quadratic programming
SQP is, essentially, a generalization of Newton's method to the constrained-optimization case. At each iteration, SQP methods form a quadratic model of NLP about the current iterate—a quadratic model of the Lagrangian, and a linear model of the constraints—and minimize this model to determine a new
iterate. Its subproblems can be written as

    IQP(x)    minimize_d   gL(x)ᵀd + ½ dᵀHL(x)d
              subject to   c(x) + J(x)d = 0
                           x + d ≥ 0.

Algorithm 2.3 lists a basic SQP algorithm (note that this is simply a local algorithm; we will discuss globalization techniques in Section 3.2). SQP allows for any method of solving IQP. Active-set quadratic-programming methods [16], however, are a common choice because of their warm-start efficiency. In this case, SQP itself is an active-set method—iteratively updating an estimate, A, of the optimal active set, A⋆, for NLP. Under mild conditions it can be shown that (after incorporating globalization techniques similar to those discussed in Section 3.2) iterates generated by SQP give A = A⋆ within a finite number of steps [17], and that, once this occurs, they achieve quadratic convergence towards a solution, x⋆ [13, Theorem 18.2].
Chapter 3
Sequential linear–quadratic programming

3.1 Local algorithm
As the numbers of constraints and variables increase into the millions, each iteration of SQP can become very expensive [3]. This cost is dominated by solving the IQP subproblem—this may be very slow for large-scale problems, considering the combinatorial difficulties implicit in the bound constraints, together with the possibility of a nonconvex quadratic objective to handle. The idea behind sequential linear–quadratic programming (SL–QP) was first introduced by Fletcher and Sainz de la Maza [7] as a means of reducing the cost of major iterations compared with SQP; more recently, algorithms were proposed both by Chin and Fletcher [5] and by Byrd, Gould, Nocedal, and Waltz [3]. Instead of solving IQP, SL–QP solves at each iteration a linear model of NLP given by

    LP(x)    minimize_d   gL(x)ᵀd
             subject to   c(x) + J(x)d = 0                           (3.1)
                          x + d ≥ 0,
in order to update its estimate of the optimal active set. Here, SL–QP bears a resemblance to sequential linear programming methods [14], which have been used for many years. However, SL–QP methods generate an additional search direction by solving the equality-constrained quadratic program

    EQP(x, A)    minimize_d   gL(x)ᵀd + ½ dᵀHL(x)d
                 subject to   c(x) + J(x)d = 0
                              dᵢ = 0 for i ∈ A.

This additional step has the benefit of allowing for a superlinear convergence rate, as experienced by SQP methods. Importantly, EQP—a quadratic model of NLP in which active-set variables are fixed and other bounds are ignored—can be much cheaper to solve than IQP, as it avoids handling variable bounds. Algorithm 3.1 lists a basic SL–QP algorithm.

SQP methods have a single subproblem (IQP) that gives both active-set prediction and fast convergence. SL–QP methods decouple these objectives into more specialized subproblems. The hope is that the LP step will give a good prediction of
Algorithm 3.1: Basic local SL–QP method
input: x, A
1  while x is not optimal for NLP do
2      A ← solve LP(x)[A]
3      d ← solve EQP(x, A)
4      x ← x + d
5  end
the optimal active set, without the significant cost of solving IQP (LP is a simpler and more structured problem than IQP, and linear-programming technology is more advanced than that of quadratic programming) and, as such, will handle large-scale problems more efficiently. Algorithm 3.1 on its own, however, is not guaranteed to converge. In the next two sections, we discuss features incorporated into Chin et al.'s algorithm and into Byrd et al.'s algorithm which allow for such guarantees.
3.2 Globalization
Algorithm 3.1 is a local algorithm. That is, starting at a point x arbitrarily far from a solution to NLP, it may never converge to a solution. To drive iterates towards a solution from an arbitrary starting point (globalization), iterative algorithms typically employ a linesearch or a trust-region, both of which control the size of the step d taken at each iteration. The algorithms of Byrd et al. [3] and of Chin et al. [5] are both trust-region methods. Instead of solving LP, which may be a very poor model of NLP far from the current iterate x, they solve

    TrLP(x, ∆)    minimize_d   gL(x)ᵀd
                  subject to   c(x) + J(x)d = 0                      (3.2)
                               x + d ≥ 0
                               ‖d‖∞ ≤ ∆,
where the step d is forced to lie within an infinity-norm ball of radius ∆ (called the trust-region radius). The goal is to force the generated step to lie within a region in which we trust our model of NLP. In a trust-region approach, if a step is deemed to have made sufficient progress towards a solution, ∆ is held constant—or even increased to allow bigger (bolder) steps. Otherwise, our model of NLP may be too poor, and ∆ must be reduced to take smaller (safer) steps. But how do we determine whether a step gives sufficient progress? Clearly a step which reduces the objective and brings us closer to the feasible region for NLP should be taken. By the same token, one which increases the objective and moves us farther away from the feasible region should not be taken. However,
what if a step generated by solving TrLP significantly reduces the objective, but moves us farther away from the feasible region of NLP? The answer is not obvious. Clearly some tradeoff between the goals of minimizing the objective and becoming feasible must be made. Answers to how to make this tradeoff have been the subject of a great deal of research, and this is an issue on which the algorithms of Byrd et al. [3] and of Chin et al. [5] differ. Byrd et al. employ a merit function and Chin et al. use a filter. We discuss both ideas now.
3.2.1 Merit functions
For unconstrained problems, sufficient progress towards a solution can simply be equated to sufficient descent in the objective; this single function defines the goal of a step. Merit functions seek to generalize this concept to constrained optimization. They combine the objective and constraint violation into a single function, φ, known as the merit function. Sufficient progress is then determined by whether or not a step generates sufficient descent on this combined function. One way to measure constraint violation is via the function

    h(x) := ‖c(x)‖₁.                                                 (3.3)

(The algorithms we consider explicitly enforce bound constraints at each iteration, and so we do not consider bound violation here.) A common merit function, and the one employed by Byrd et al., is the ℓ1 merit function

    φ1(x, ν) := f(x) + ν h(x),                                       (3.4)

where ν > 0. This is an exact merit function [13, Section 15.3].

Definition 3.1. A merit function φ(x, ν) is exact if there exists some ν⋆ > 0 such that for any ν > ν⋆, every local solution of NLP corresponds to a local minimizer of φ(x, ν) [13, Definition 15.1].

It can be shown that, for the ℓ1 merit function,

    ν⋆ := ‖y⋆‖∞,                                                     (3.5)

where y⋆ are the Lagrange multipliers at a solution to NLP, satisfies Definition 3.1 [13].

We refer to ν as the penalty parameter; it controls the amount of penalization given to constraint violations. Choosing an appropriate value, νk, for this penalty parameter at each iteration k is a difficult task. Convergence proofs for an algorithm which uses an exact merit function typically require that the algorithm eventually generates iterates with νk > ν⋆. Perhaps the simplest solution is to choose some ν0 ≫ ν⋆ and to set νk = ν0 at every iteration. This is not effective in practice for several reasons, however. First, we do not know y⋆ (or, consequently, ν⋆) until we've actually solved the problem, so it is impossible to know for sure whether ν0 is large enough. Second, there may be multiple solutions to NLP, and we may thus be required to know y⋆ for all of them. Third, it
has been observed in practice that if νk is chosen too large, convergence can be very slow, and so choosing an arbitrarily large ν0 may be very inefficient [3]. Despite this difficulty, heuristics have been developed for choosing νk that have been shown to be effective in practice [4, 13]. We discuss one such heuristic in Section 4.3.
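For concreteness, the following small sketch transcribes Equations (3.3) and (3.4) into code; f and c are assumed user-supplied callables, and this is an illustration rather than the thesis's implementation.

    import numpy as np

    def h(c, x):
        """Constraint violation h(x) := ||c(x)||_1 (Equation 3.3)."""
        return np.linalg.norm(c(x), 1)

    def phi1(f, c, x, nu):
        """Exact l1 merit function phi1(x, nu) := f(x) + nu h(x) (Equation 3.4)."""
        return f(x) + nu * h(c, x)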
3.2.2 Filters
The ℓ1 merit function combines the goals of minimizing f(x) and h(x). In contrast, filter methods, introduced by Fletcher and Leyffer [8], view these as two separate and competing goals, i.e.,

    minimize f(x)   and   minimize h(x).

Instead of instituting an arbitrary trade-off between these goals, a filter method considers only whether the objective–infeasibility pair at a given point, (f1, h1) := (f(x1), h(x1)), is unequivocally worse than another, (f2, h2). If so, we say that the latter dominates the former.

Definition 3.2. A pair (f2, h2) dominates another pair (f1, h1) if and only if f2 ≤ f1 and h2 ≤ h1 [8, Definition 1].

In the basic filter method, a step is acceptable if the new objective–infeasibility pair it would give, (f, h), is not dominated by that of any previous step taken by the algorithm. This is determined by maintaining a list of previously accepted pairs—a filter. Each time a step is accepted, its pair may be added to the filter (and, for efficiency, any pairs in the filter which it dominates are subsequently removed).

Definition 3.3. A filter is a list of pairs (f, h) such that no pair dominates any other [8, Definition 2].

A step has sufficient progress if it is acceptable to the filter.

Definition 3.4. A step d generated about the current iterate x is acceptable to a filter if the pair (f(x + d), h(x + d)) is not dominated by any pair in the filter.

The SL–QP algorithm of Chin et al. [5] uses a filter. Two of the main draws of this globalization method are that fewer steps are rejected by a filter than by a merit function (it is, in some sense, as accepting as possible), and that no arbitrary parameter is required. However, while the basic idea is quite attractive due to its elegance, many less attractive features must be added in practice to ensure convergence [8]. In particular, the notion of an envelope (which introduces the very tradeoff between f and h we were trying to avoid), and the requirement that certain iterations neglect filter acceptability and instead give descent on the objective, detract from the elegance of the method. The effectiveness of filters in practice is also less established than that of merit functions.
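The basic filter mechanics of Definitions 3.2–3.4 are simple enough to sketch directly; the following illustrative Python deliberately omits the envelope and the other practical safeguards discussed above.

    def dominates(p2, p1):
        """(f2, h2) dominates (f1, h1) iff f2 <= f1 and h2 <= h1 (Definition 3.2)."""
        return p2[0] <= p1[0] and p2[1] <= p1[1]

    class Filter:
        """A list of mutually non-dominated (f, h) pairs (Definition 3.3)."""

        def __init__(self):
            self.pairs = []

        def acceptable(self, pair):
            """A step is acceptable if no stored pair dominates its (f, h)."""
            return not any(dominates(p, pair) for p in self.pairs)

        def add(self, pair):
            """Insert an accepted pair, dropping any pairs it now dominates."""
            self.pairs = [p for p in self.pairs if not dominates(pair, p)]
            self.pairs.append(pair)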
Figure 3.1: A small trust-region can induce a TrLP subproblem to be infeasible. Here, every point on the line c(x) + J(x)d = 0 is feasible for the corresponding LP subproblem. However, none of these points lie within the trust-region radius, ∆, of x, and so the TrLP subproblem is infeasible.
3.3 Infeasible subproblems
Algorithm 3.1 assumes that we can solve TrLP at every iteration. In practice, however, this is not always possible. Assuming that the original NLP is feasible, there are two ways that the TrLP subproblem can be infeasible. First, the linearized constraints about a point x may themselves be infeasible: if the constraint gradients, ∇cᵢ(x)ᵀ, are degenerate, then the system ∇c(x)ᵀd = −c(x) may simply have no solution. Second, if the trust-region is too small, the TrLP subproblem may be infeasible even if the corresponding LP subproblem was not. Consider Figure 3.1. Here, the linearized constraints, c(x) + J(x)d = 0, while themselves consistent, cannot be satisfied within the trust-region radius, ∆, of x, and so the TrLP subproblem is infeasible.

If an infeasible subproblem is encountered, a successful algorithm must have a means of determining a new step and proceeding with its iterations. Two common techniques for this are feasibility restoration—which is utilized in Chin et al.'s algorithm [5]—and penalty reformulation—which Byrd et al. use [3]. We explore these alternatives next.
3.3.1 Feasibility restoration
If x is feasible for NLP, then the subproblem TrLP(x, ∆) is guaranteed to be feasible for any ∆ (we can see immediately that the step d = 0 satisfies the constraints in TrLP when c(x) = 0 and x ≥ 0). Feasibility restoration searches for such a feasible point, so that we can be sure that the next TrLP subproblem to solve will be feasible. It was introduced by Fletcher and Leyffer [8] specifically
for use with the filter method; however, there is nothing in the method that prevents one from using it with a merit-function method. A feasibility restoration step attempts to find a feasible point by solving the problem

    FR    minimize_x   h(x)
          subject to   x ≥ 0,                                        (3.6)
where h is defined in Equation 3.3. If the objective at a solution to FR is nonzero, then we could not find a feasible point, and NLP is declared infeasible. Otherwise, we have found a feasible point for NLP, and may proceed with the regular iterations of our algorithm. Any algorithm which can minimize a nonlinear objective subject to bound constraints will be suitable to generate a feasibility restoration step. However, Fletcher and Leyffer [8] do describe a method designed specifically for this purpose. Unfortunately, feasibility restoration can be quite expensive. Indeed, FR can be almost as difficult to solve as our original NLP! Another potential weakness of feasibility restoration is that it has essentially given up on the objective and, as such, the step it generates may unnecessarily increase it.
3.3.2 Penalty reformulation
Penalty reformulation is a staple technique for handling infeasible subproblems. Its use in practice is described in [10], and it is also discussed in detail in [13]. Here, the view is that there is no need to satisfy the linearized constraints at every iteration. If an infeasible TrLP subproblem is encountered, a penalty reformulation method instead solves

    PLP(x, µ)    minimize_d   gL(x)ᵀd + µ‖c(x) + J(x)d‖₁
                 subject to   x + d ≥ 0
                              ‖d‖∞ ≤ ∆,

which is formed by replacing the explicit constraints (c(x) + J(x)d = 0) with a term in the objective that penalizes their violation. Importantly, PLP is always feasible as long as x ≥ 0, which the algorithms we consider enforce explicitly at each iteration. It is common to reformulate PLP as a linear program, thus taking advantage of the availability of highly-tuned large-scale linear programming solvers. PLP can be re-written as

    PR(x, µ)    minimize_{d,q,r}   gL(x)ᵀd + µ eᵀq + µ eᵀr
                subject to         c(x) + J(x)d = q − r
                                   x + d, q, r ≥ 0
                                   ‖d‖∞ ≤ ∆,

where q, r ∈ Rm are slack vectors, and e ∈ Rm is a vector of ones.
Two advantages of penalty reformulation over feasibility restoration are that PR is much cheaper to solve than FR, and that it includes objective information instead of throwing it away. The step generated by PR, however, may not give as much of a reduction in infeasibility as FR. Also, penalty reformulation requires the choice of a parameter, µ. Furthermore, the effectiveness of a method using penalty reformulation can be very sensitive to the choice of this parameter, and finding an appropriate value may be quite expensive in practice [4]. While PR has the same linear-program form as TrLP, it has n + 2m variables compared with n for the latter—it may be significantly more expensive to solve. This, together with a potentially expensive routine for selecting µ (Byrd et al. [3] suggest a full extra solve of PR!), makes penalty reformulation iterations quite costly for an algorithm.
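As an illustration of the reformulation PR(x, µ), the following hedged sketch assembles the LP in standard form and hands it to SciPy's linprog. The dense matrix construction and the solver choice are assumptions made for clarity, not the approach an efficient large-scale implementation would take.

    import numpy as np
    from scipy.optimize import linprog

    def solve_pr(g, J, cx, x, mu, delta):
        """Solve PR(x, mu) for the step d; variable order is (d, q, r)."""
        m, n = J.shape
        obj = np.concatenate([g, mu * np.ones(2 * m)])     # g'd + mu e'q + mu e'r
        A_eq = np.hstack([J, -np.eye(m), np.eye(m)])       # J d - q + r = -c(x)
        b_eq = -cx
        bounds = ([(max(-xi, -delta), delta) for xi in x]  # x + d >= 0, |d| <= delta
                  + [(0, None)] * (2 * m))                 # q, r >= 0
        res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
        return res.x[:n]                                   # the step d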
Chapter 4
Sequential regularized linear programming

This chapter presents a new active-set method, SRLP (sequential regularized linear programming), for solving large-scale NLPs.
4.1 Algorithm outline
In order to control the size of steps generated at each iteration, current SL–QP algorithms add a trust-region constraint to the LP subproblem, solving TrLP. In contrast, SRLP controls the step size and enforces global convergence by adding a regularization term to the LP subproblem, which penalizes overly large steps. At each iteration a regularized linear program,

    RLP(x, ρ)    minimize_d   gL(x)ᵀd + ½ρ‖d‖₂²
                 subject to   c(x) + J(x)d = 0                       (4.1)
                              x + d ≥ 0,
is solved. This idea was first proposed in a technical report by Friedlander, Gould, Leyffer, and Munson [9], but to our knowledge it has not yet been implemented. Algorithm 4.1 lists an outline of our proposed SRLP algorithm. Lines 3 and 4 handle the possibility of encountering an infeasible RLP subproblem, as discussed in Section 4.2. Lines 7 through 10 are repeated until an acceptable step, the definition of which is given in Section 4.3, is generated; within this loop, we update the regularization parameter, as discussed in Section 4.4. Finally, line 13 updates estimates of the optimal Lagrange multipliers of NLP, which is discussed in Section 4.5. (While SRLP does not explicitly require these estimates, they are used in practice in order to determine whether optimality conditions have been satisfied, and as part of our globalization technique.)

Notice that SRLP does not take an EQP step, and thus does not fully qualify as an SL–QP method. Convergence guarantees of current SL–QP algorithms, however, rely solely on the LP step, with the EQP step added only to increase the convergence rate [3, 5]. As such, SRLP provides a suitable means of determining the effectiveness of using regularization versus a trust-region to control step size in an SL–QP method. A practical implementation, though, will likely include an
Algorithm 4.1: Sequential regularized linear programming (outline)
input: x, ρ > 0
 1  while x is not optimal for NLP do
 2      if RLP(x, ρ) is infeasible then
 3          d ← handle infeasible subproblem               /* Section 4.2 */
 4      else
 5          d ← solve RLP(x, ρ)[A]
 6          while d is not acceptable do                   /* Section 4.3 */
 7              ρ ← update regularization parameter        /* Section 4.4 */
 8              d ← find a new step                        /* Section 4.4 */
 9          end
10      end
11      x ← x + d
12      update Lagrange multiplier estimates               /* Section 4.5 */
13  end
EQP step, and the Lagrange multiplier estimates discussed in Section 4.5 will be important for computing the Hessian of the Lagrangian in this case.
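For concreteness, the subproblem (4.1) can be posed to any solver that handles convex quadratic programs. The sketch below uses SciPy's SLSQP purely as a generic stand-in (the thesis's implementation uses SQOPT; see Section 5.3.4); g, J, and cx denote gL(x), J(x), and c(x), all assumed to be supplied by the caller.

    import numpy as np
    from scipy.optimize import minimize

    def solve_rlp(g, J, cx, x, rho):
        """Solve RLP(x, rho) for a step d, with g = gL(x) and cx = c(x)."""
        obj  = lambda d: g @ d + 0.5 * rho * (d @ d)   # gL'd + (rho/2)||d||^2
        grad = lambda d: g + rho * d
        cons = [{"type": "eq",
                 "fun": lambda d: cx + J @ d,          # linearized constraints
                 "jac": lambda d: J}]
        bnds = [(-xi, None) for xi in x]               # x + d >= 0
        res = minimize(obj, np.zeros(x.size), jac=grad, method="SLSQP",
                       constraints=cons, bounds=bnds)
        return res.x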
4.2 Infeasible subproblems
The RLP subproblem encountered on a given iteration may be infeasible if the linearized constraints (those in Equation 3.1) are degenerate (see Section 3.3). SRLP generates a feasibility restoration step, which solves FR (3.6), when this occurs. There are many solutions to FR. We take the view that feasibility restoration should interfere with regular iterations as little as possible, and so choose the solution which is as close to our current iterate, in a 2-norm sense, as possible. Our feasibility restoration steps solve

    MINFR(x)    minimize_d   ‖d‖₂
                subject to   c(x + d) = 0                            (4.2)
                             x + d ≥ 0.
If MINFR(x) is infeasible, SRLP declares NLP itself to be infeasible, and iterations are terminated. Otherwise, its solution d is taken as a step in lieu of an RLP-generated step.
4.3 Step acceptance
Successively accepting any step derived from the solution to RLP does not guarantee convergence to a first-order KKT point of NLP from an initial point arbitrarily far away. Instead, we must employ globalization techniques. This
requirement is common for any algorithm in which local models of a problem are used as subproblems and, as such, a large body of work has been published analyzing various strategies. See Section 3.2 for a review of some relevant techniques.

Our globalization strategy will define what an acceptable step is for our algorithm. When a step is deemed acceptable, the SRLP algorithm takes it; otherwise either the RLP subproblem is modified or the step is modified until it is acceptable. SRLP utilizes the ℓ1 exact merit function (φ1, see Equation 3.4) to enforce globalization. While this technique has the potential downside of requiring a good choice of penalty parameter, we have found the method we describe below to work very well in practice.
4.3.1 Penalty parameter choice
Our primary concern in choosing νk at each iteration is that φ1 be exact. Thus, following Equation 3.5, we impose the condition

    νk > νa := ‖yk‖∞,

where yk are the Lagrange multiplier estimates at the given iteration (see Section 4.5). Secondly, we require that νk be chosen so that a solution to RLP is at least a descent direction for the merit function φ1(xk, νk). Steps dk generated by SRLP are guaranteed to satisfy J(xk) dk = −c(xk) (see Equation 4.1), and the directional derivative of φ1 in such a direction is given by

    D(φ1, dk) = ∇f(xk)ᵀdk − νk ‖c(xk)‖₁

[13, Lemma 18.2]. Our requirement that D(φ1, dk) < 0 then gives us a second condition for the penalty parameter:

    νk > νb := ∇f(xk)ᵀdk / ‖c(xk)‖₁    for c(xk) ≠ 0.

Notice that if xk is feasible (i.e., ‖c(xk)‖₁ = 0), then descent on the merit function is not determined by νk. In this case, however, a step d which solves RLP is always a descent direction for the objective unless we are optimal, and so descent on the merit function will be guaranteed for any value of νk. Our final requirement amalgamates these two conditions:

    νk > νmin := max(νa, νb)   if ‖c(xk)‖₁ ≠ 0;
                 νa            otherwise.                            (4.3)

In practice, constants ε > 0 and c ≥ 1 are chosen so that

    ν = c νmin + ε                                                   (4.4)

is sufficiently large. (The specific choice of c and ε may have a large practical impact on an algorithm, and adjustment may be required to find appropriate values for a given implementation.)
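A direct transcription of the rule (4.3)–(4.4) might look as follows; the default values of c_mult and eps are placeholders, since, as noted above, appropriate values are implementation-dependent.

    import numpy as np

    def penalty_parameter(y, gf, d, cx, c_mult=1.1, eps=1e-6):
        """Penalty parameter nu from Equations (4.3)-(4.4).

        y: multiplier estimates; gf = grad f(x); d: trial step; cx = c(x).
        """
        nu_a = np.linalg.norm(y, np.inf)          # exactness: nu > ||y||_inf
        c1 = np.linalg.norm(cx, 1)
        if c1 != 0:
            nu_min = max(nu_a, (gf @ d) / c1)     # also force D(phi1, d) < 0
        else:
            nu_min = nu_a
        return c_mult * nu_min + eps              # nu = c nu_min + eps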
4.3.2 An acceptable step
The use of an exact merit function defines a single goal for each iteration of SRLP, namely reduction of the merit function. It is well known, however, that simply achieving reduction in the merit function at each iteration may, instead of achieving convergence to a first-order KKT point of NLP, reach a non-optimal accumulation point. Instead we must require that each iteration gives at least a sufficient amount of reduction in the merit function. We define this to mean that the step achieves at least a fraction of the descent predicted by a linear model of φ1, defined by

    ℓ(d, x, ν) := f(x) + g(x)ᵀd + ν‖c(x) + J(x)d‖₁.

We are now prepared to define an acceptable step for SRLP.

Definition 4.1. A step d generated by SRLP about a current iterate x is considered acceptable if it satisfies

    φ1(x, ν) − φ1(x + d, ν) > β (φ1(x, ν) − ℓ(d, x, ν)),

where β ∈ (0, 1) is the required fraction of descent, and ν is chosen by Equation 4.4. (A similar condition is used in [5] and in [3].)

In our algorithm we use the notion of step quality. A high-quality step will be accepted, whereas a low-quality one will not.

Definition 4.2. The quality, q, of a step d is the fraction of predicted linear descent in φ1 achieved by the step:

    q := [φ1(x, ν) − φ1(x + d, ν)] / [φ1(x, ν) − ℓ(d, x, ν)].        (4.5)

With q given by Equation 4.5, and β our chosen required fraction of descent, our step acceptance criterion becomes

    q > β.
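The quality ratio (4.5) and the acceptance test are likewise straightforward to transcribe; the function below takes pre-evaluated quantities f(x), f(x + d), c(x), and c(x + d), which emphasizes that each quality check needs only one new evaluation of f and c.

    import numpy as np

    def step_quality(fx, fxd, cx, cxd, g, J, d, nu):
        """Step quality q of Equation (4.5).

        fx = f(x), fxd = f(x+d), cx = c(x), cxd = c(x+d), g = grad f(x).
        """
        phi_x  = fx + nu * np.linalg.norm(cx, 1)                   # phi1(x, nu)
        phi_xd = fxd + nu * np.linalg.norm(cxd, 1)                 # phi1(x+d, nu)
        lin    = fx + g @ d + nu * np.linalg.norm(cx + J @ d, 1)   # l(d, x, nu)
        return (phi_x - phi_xd) / (phi_x - lin)

    def acceptable(q, beta):
        """Acceptance test q > beta, with beta in (0, 1)."""
        return q > beta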
4.4 Regularization parameter updates
Consider a given iteration k of SRLP with an iterate xk, and where the current value of the regularization parameter is ρk. This section discusses how SRLP chooses ρk+1, and concurrently generates an acceptable step. For convenience, we will herein use the notation dρ to signify the solution to the subproblem RLP(x, ρ). The regularization term in RLP, ρ‖d‖₂², replaces the trust-region constraint, ‖d‖∞ ≤ ∆, in TrLP. Like the trust-region, its role is to control the length of the step d taken by the subproblem at each iteration. The regularization parameter, ρ, defines the degree of penalization of the step-length in a 2-norm sense: ρ = 0 gives no penalization, and a step which freely solves LP(x); ρ = ∞ gives the
smallest possible step that satisfies the linearized constraints of NLP about x, i.e.,

    RLP(x, ∞)    minimize_d   ‖d‖₂²
                 subject to   c(x) + J(x)d = 0                       (4.6)
                              x + d ≥ 0.
There is an important distinction between a trust-region constraint and regularization, however: a trust-region imposes a direct and strict limit on step-length, where regularization only imposes a degree of aversion to large steps. Thus, using regularization, SRLP does not have full control over step-length, and this has important consequences for the convergence of the algorithm. In particular, notice that it is not true in general that ‖dρ‖₂ → 0 as ρ → ∞ (consider any instance where c(x) ≠ 0; in that case d = 0 cannot satisfy the constraints in Equation 4.6).

The primary aim of controlling step-length, whether by regularization or a trust-region, is to ensure that an acceptable step is taken at each iteration. This is required by globalization techniques and thus for convergence of an algorithm. SRLP's primary means of step-length control is through updates to the regularization parameter. When an unaccepted step dρa is generated, we predict that, in the large majority of cases, we can find some ρb > ρa so that dρb is acceptable. However, we cannot guarantee that such a ρb exists, and, if it does not, a secondary means of generating an acceptable step is required.

One possible choice for this secondary step is a linesearch on some dρ. Our choice of penalty parameter (Section 4.3) ensures that dρ (for any value of ρ) is a direction of descent for the merit function, φ1. By the twice-continuous differentiability of f and c, then, we can see that some α > 0 exists such that d := α dρ satisfies our acceptance requirement (Definition 4.1); we can find an acceptable step by performing a linesearch on any step dρ. While any dρ will suffice, d∞ (a solution to (4.6)) is in some sense the best choice: it defines the direction in which the linearized constraints are nearest—in a 2-norm sense—to being satisfied, and is thus likely to give good progress towards feasibility.

To maximize the efficiency of SRLP, we wish to minimize the number of RLP subproblems solved at each iteration. We expect that, on most iterations, the initial step dρa will be accepted. When it is not, we further expect that a modest increase in the regularization parameter, ρb = γρa (with γ > 1), will, in the majority of cases, generate an acceptable step dρb. Thus, SRLP attempts each of these steps in succession before any more expensive means of analyzing the subproblem are performed. If the second attempted step is still not acceptable (we expect this to be a rare occurrence in practice), we immediately seek to determine whether there exists any ρc > ρb for which dρc is acceptable. This test is performed by solving RLP a third time to obtain d∞. If d∞ is unacceptable, we know immediately that increasing ρ further is futile, and instead perform a linesearch on this direction. Otherwise, we know that there is some ρc large enough to generate an acceptable step, and we continue to search for it.
Algorithm 4.2: Regularization parameter update and step selection
input: d, x, ρ > 0, γ > 1
 1  if d is not acceptable then
 2      ρ ← γρ                                    /* try increasing ρ */
 3      d ← solve RLP(x, ρ)
 4      if d is not acceptable then
 5          d∞ ← solve RLP(x, ∞)                  /* test ρ = ∞ step */
 6          if d∞ is not acceptable then
 7              choose α ∈ (0, 1) such that α d∞ is acceptable
 8              d ← α d∞
 9          else
10              choose ρc > ρ such that dρc is acceptable
11              d ← solve RLP(x, ρc)
12              ρ ← ρc
13          end
14      end
15  else
16      possibly decrease ρ
17  end
18  x ← x + d
Algorithm 4.2 lists our regularization parameter update scheme.
4.5 Lagrange multiplier estimates
Estimates of the optimal Lagrange multipliers for NLP are used in order to update the penalty parameter (Section 4.3), to check whether optimality conditions are satisfied (Section 5.3.7), and, when an EQP step is added, to compute an estimate of the Hessian of the Lagrangian. We have three main criteria when choosing a source for these estimates. First, they should be inexpensive to compute. Second, they should provide a good estimate of the optimal Lagrange multipliers, based on the limited local information we have of the problem. Third, and most important, they should converge to the optimal Lagrange multipliers as we approach a solution to NLP. SRLP uses the optimal Lagrange multipliers for the RLP subproblems (the RLP estimates, see Definition 4.3) as an estimate of the optimal Lagrange multipliers of NLP (the optimal multipliers) at each iteration. These have the immediate advantage that almost any solver of RLP will provide them at no extra cost over solving the subproblem.
Table 4.1: First-order KKT conditions satisfied by the optimal Lagrange multipliers for NLP at an optimal point x⋆, and by the RLP estimates at a given iteration.

    NLP                           RLP
    g(x⋆) − J(x⋆)ᵀy⋆ = z⋆         g(x) + ρd − J(x)ᵀȳ = z̄
    z⋆ ≥ 0                        z̄ ≥ 0
    z⋆ ◦ x⋆ = 0                   z̄ ◦ (x + d) = 0

Definition 4.3. The RLP Lagrange multiplier estimates (or RLP estimates) are the pair of vectors ȳ ∈ Rm and z̄ ∈ Rn which, together with the solution d, satisfy the first-order KKT conditions of the subproblem solved at a given iteration, RLP(x, ρ) (see Table 4.1).

The first-order KKT conditions for RLP and NLP give a system of equations that the RLP multipliers and the optimal multipliers must satisfy; Table 4.1 lists the relevant conditions for convenience. Consider the conditions in Table 4.1. As x → x⋆ and, consequently, d → 0, we can see by inspection that (y⋆, z⋆) satisfy the conditions for the RLP estimates. Furthermore, the RLP estimates are unique (by the convexity of RLP). Continuous differentiability of f and c then gives us (ȳ, z̄) → (y⋆, z⋆) as x → x⋆; the RLP Lagrange multiplier estimates approach the optimal Lagrange multipliers for NLP as we approach a solution.
4.6 Algorithm statement
A full listing of SRLP is given in Algorithm 4.3. This pseudo-code describes a process which iterates until an estimate x that is optimal for NLP is found. This stopping condition is left open to the implementation of the algorithm—we discuss the condition used in this thesis in Section 5.3.7. Throughout the rest of this thesis, we will refer to major iterations—which we define to be a full pass through the loop from lines 1 to 28—and to minor iterations—which we define to be the iterations taken by the RLP subproblem solver. The cost of these minor iterations will play a very important role in the effectiveness of a practical implementation of SRLP, and will be highly dependent on the choice of a solver for the RLP subproblems. We discuss this issue in the context of our implementation of SRLP in Section 5.3.4. Note that the regularization parameter update scheme presented in Section 4.4 is directly incorporated into this flow.

Algorithm 4.3 also utilizes a number of subroutine calls, which we now define.

d, y, A ← solve RLP(x, ρ): Solve the subproblem RLP(x, ρ) (Equation 4.1) to obtain a step, d, and Lagrange multiplier estimates, y (see Section 4.5). Compute the active set, A(x + d).

d ← solve MINFR(x): Solve the subproblem MINFR(x) for a step d.
ν ← penaltyParameter(d, y, x): Compute ν from Equation 4.4.

q ← stepQuality(d, ν, x): Compute q from Equation 4.5.
Algorithm 4.3: Sequential regularized linear programming
input: x, y, A, ρ > 0, qmin > 0, qhigh ≥ qmin
 1  while x is not optimal for NLP do
 2      if RLP(x, ρ) is feasible then                      /* RLP step */
 3          d, y, A ← solve RLP(x, ρ)[A]
 4          ν ← penaltyParameter(d, y, x)
 5      else                                               /* feasibility restoration step */
 6          d ← solve MINFR(x)
 7      end
 8      if stepQuality(d, ν, x) < qmin then                /* try increasing ρ */
 9          ρ ← 2ρ
10          d, y ← solve RLP(x, ρ)[A]
11          if stepQuality(d, ν, x) < qmin then            /* test ρ = ∞ step */
12              d, y ← solve RLP(x, ∞)[A]
13              if stepQuality(d, ν, x) < qmin then        /* linesearch on ρ = ∞ step */
14                  repeat
15                      d ← ½d
16                  until stepQuality(d, ν, x) ≥ qmin
17              else                                       /* increase ρ until acceptable step found */
18                  repeat
19                      ρ ← 2ρ
20                      d, y ← solve RLP(x, ρ)[A]
21                  until stepQuality(d, ν, x) ≥ qmin
22              end
23          end
24      else if stepQuality(d, ν, x) ≥ qhigh then
25          ρ ← ½ρ
26      end
27      x ← x + d
28  end
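To summarize the control flow, here is a condensed, hedged Python sketch of the outer loop of Algorithm 4.3. The helpers solve_rlp, solve_minfr, penalty_parameter, step_quality, rlp_feasible, and is_optimal are hypothetical stand-ins with simplified signatures for the routines defined in this chapter, and the initial nu is an assumed default (ν is only recomputed on RLP steps).

    def srlp(x, rho=1.0, q_min=0.2, q_high=0.8, nu=1.0):
        while not is_optimal(x):
            if rlp_feasible(x, rho):
                d, y = solve_rlp(x, rho)                       # RLP step
                nu = penalty_parameter(x, y, d)
            else:
                d = solve_minfr(x)                             # feasibility restoration
            if step_quality(x, d, nu) < q_min:
                rho *= 2                                       # try increasing rho
                d, y = solve_rlp(x, rho)
                if step_quality(x, d, nu) < q_min:
                    d, y = solve_rlp(x, float("inf"))          # test rho = infinity step
                    if step_quality(x, d, nu) < q_min:
                        while step_quality(x, d, nu) < q_min:
                            d = 0.5 * d                        # linesearch on this step
                    else:
                        while True:                            # increase rho until acceptable
                            rho *= 2
                            d, y = solve_rlp(x, rho)
                            if step_quality(x, d, nu) >= q_min:
                                break
            elif step_quality(x, d, nu) >= q_high:
                rho *= 0.5
            x = x + d
        return x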
Chapter 5
Experiments

This chapter discusses the details of the experiments performed for this thesis. The implementation of SRLP (Algorithm 4.3), as well as the testing environment used, is described. Determining the effectiveness of SRLP requires a benchmark, and we begin by presenting the benchmarking algorithm used in this thesis—a sequential trust-region linear programming method (STrLP).
5.1 A benchmark trust-region algorithm
Like SRLP, STrLP is an SL–QP algorithm without an EQP step. However, it generates steps at each iteration by solving a TrLP, rather than an RLP, subproblem. Algorithm 5.1 lists STrLP. It is by no means novel—in fact, it closely resembles the SL–QP implementation by Byrd et al. [3]. The details of this algorithm are meant to mimic those of SRLP as closely as possible, so that the two may be fairly compared: globalization is similarly enforced by an ℓ1 merit function; the penalty parameter update method and step acceptance criterion are identical to those of SRLP; and it uses the same feasibility restoration routine as SRLP. The significant difference between STrLP and SRLP is the means by which they control the length of the step generated by subproblems. For this, STrLP uses a trust-region. When the quality of a step (Equation 4.5) is deemed too low, the trust-region radius, ∆, is reduced; if it is very high, ∆ may be increased.
5.2 Testing environment
Determining the success of an optimization algorithm is quite difficult to do objectively. Undoubtedly, an algorithm will fare better on certain types of problems and worse on others. The Constrained and Unconstrained Testing Environment revisited (CUTEr) [12] is a set of test problems which is commonly used as a benchmarking environment for optimization algorithms. We have selected a subset of the CUTEr problems as our test set for SRLP and STrLP. Only generally-constrained problems (nonlinear programs excluding linear and quadratic programs) for which n + m < 1000 were chosen. From the 225 matching problems, we removed a number which were found to be ill-suited to our analysis: 1 unconstrained problem (this problem was labeled generally-constrained, but we found it to contain no constraints or variable bounds),
Algorithm 5.1: Sequential trust-region linear programming
input: x, y, ∆ > 0, qmin, qhigh
 1  while x is not optimal for NLP do
 2      if TrLP(∆, x) is feasible then                     /* TrLP step */
 3          d, y, A ← solve TrLP(∆, x)[A]
 4      else                                               /* feasibility restoration step */
 5          d ← solve MINFR(x)
 6      end
 7      ν ← penaltyParameter(d, y, x)
 8      if stepQuality(d, ν, x) < qmin then                /* don't accept step */
 9          ∆ ← ½∆
10      else                                               /* accept step */
11          if stepQuality(d, ν, x) ≥ qhigh then
12              ∆ ← 2∆
13          end
14          x ← x + d
15      end
16  end
3 problems which would not compile on our system, 5 problems we found to be poorly defined in regions visited by our algorithm (either the objective or constraints did not evaluate to a number), 11 problems which SNOPT declared as infeasible, and 38 problems which required more than 3000 iterations to solve. We were left with a set of 167 problems which we refer to as our test set. The list of CUTEr problems included in this test set is given in Appendix B.

Why use such small-scale problems for our analysis, when our goal is efficiency on very large-scale problems? First, we are not directly concerned with the efficiency of SRLP, but rather with the use of RLP subproblems in an SL–QP method. Small-scale test problems allow us to explore a broad range of problems, which helps us to properly analyze certain features of RLP subproblems in Chapter 6. Success on these features can then be extrapolated to the efficiency of an SL–QP algorithm on large-scale problems. Second, developing a solver which robustly and efficiently handles large-scale problems in practice is a very arduous task. Many issues—such as effective use of memory, and optimized numerical routines—which can be safely forgotten on small-scale problems require very careful attention as problems become large. At some point scaling an algorithm becomes an art rather than a science.
5.3 Implementation
This section describes the details of our implementation of the SRLP and STrLP algorithms. The main logic of these algorithms was implemented in the Matlab programming language. The bulk of the computations, however, is left to highly optimized third-party Fortran routines.
5.3.1 Parameters
SRLP calls for three initial parameters—we have used ρ0 = 1, qmin = 0.2, and qhigh = 0.8. For STrLP we used ∆0 = 20, qmin = 0.1, and qhigh = 0.7. For both algorithms, x0 and y0 were provided by the CUTEr problem. The initial active-set prediction was calculated as A0 = A(x0 ).
5.3.2 Problem formulation
Instead of reformulating problems to the somewhat restrictive form of NLP (1.1), SRLP accepts problems with more generic bound constraints,

    minimize_x   f(x)
    subject to   c(x) = 0                                            (5.1)
                 bl ≤ x ≤ bu,
where bl ∈ Rn and bu ∈ Rn are the lower and upper bounds, respectively. This generalization has little effect on the details of the algorithms, and allows us to avoid splitting problem variables. Moreover, our subproblem solvers can handle these generic bound constraints directly.
5.3.3 CUTEr problems
The problems in our test set were compiled using the SifDec 2 and CUTEr 2 packages [12]. In the compilation settings for SifDec, we used the g95 Fortran compiler and the "large scale problems" option. Compiled problems were accessed from Matlab using the MEX interface provided with CUTEr. CUTEr presents problems in the generic format

    minimize_x   f(x)
    subject to   cl ≤ c(x) ≤ cu
                 bl ≤ x ≤ bu.

We reformulate these CUTEr problems into the form accepted by SRLP (5.1) by subtracting slack variables from the constraints:

    minimize_{x,s}   f(x)
    subject to       c(x) − s = 0
                     bl ≤ x ≤ bu
                     cl ≤ s ≤ cu.
Feasibility        1e-8
Optimality         1e-8
Max iterations     5000
Scale option       0

Table 5.1: Relevant SQOPT parameters used for solving RLP.
5.3.4 Solving RLP subproblems
RLP subproblems (4.1) are solved using SQOPT version 7, a large-scale linear and quadratic programming solver [16]. SQOPT is not optimized specifically for handling problems of the form of RLP, but instead solves more general convex quadratic programs. However, it has been shown in practice to be a very robust and efficient solver, and it is able to take advantage of the diagonal Hessian of RLP. In our implementation, the RLP subproblems solved at each iteration have generic bounds, and can be written as

    minimize_d   gL(x)ᵀd + ½ ρ dᵀd
    subject to   J(x)d = −c(x)
                 bl ≤ x + d ≤ bu.

SQOPT accepts problems of the form

    minimize_x   gᵀx + ½ xᵀHx
    subject to   cl ≤ Ax ≤ cu                      (5.2)
                 bl ≤ x ≤ bu.

Reformulation from one form to the other is straightforward. Importantly, though, SQOPT does not accept the Hessian of our objective as a parameter. Instead, we provide a call-back function which evaluates products with this Hessian. Our call-back function for RLP evaluates Hx, where H := ρI. SQOPT requires a number of parameters; those relevant to our implementation are listed in Table 5.1. At the time of this writing, no Matlab interface to SQOPT 7 was officially supported, and we developed a custom C MEX interface to the Fortran routines.

The exact process through which SQOPT solves a quadratic program is beyond the scope of this thesis (see instead [10, Section 4] for a detailed treatment). However, as the cost of an SRLP iteration is largely dominated by solving the RLP subproblem, it is important that we consider the source of this expense. In Section 4.6 we referred to minor iterations as the iterations taken by our subproblem solver. SQOPT obtains a search direction ∆x by solving the system

    [ H   Wᵀ ] [ ∆x ]       [ gq ]
    [ W   0  ] [  y ]  = −  [  0 ],    where gq := g + H x_k,    (5.3)

and the rows of W, the current working-set matrix, are a linearly independent subset of the constraint gradients of (5.2).
The special form of H in our RLP subproblem (i.e., H = ρI) would in principle allow us to obtain ∆x via the least-squares problem

    minimize_y   ‖Wᵀy − gq‖²₂,    followed by    ∆x = (1/ρ)(gq − Wᵀy*).

An RLP algorithm could solve this least-squares problem efficiently by maintaining a factorization of W. To compute a search direction for RLP, the algorithm would update this factorization. A similar factorization update is performed at each iteration by the simplex method for linear programming. The fundamental cost of solving RLP is thus comparable to that of solving TrLP.
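A minimal Matlab sketch of this computation, using a dense least-squares solve for illustration (a practical implementation would instead update a factorization of W across iterations):

    % W is the working-set matrix, gq = g + H*x_k, and rho > 0.
    ystar = W' \ gq;                    % least-squares solution of min ||W'y - gq||_2
    dx    = (gq - W' * ystar) / rho;    % recover the search direction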
5.3.5 Solving TrLP subproblems
TrLP subproblems are handled almost identically to RLPs (see Section 5.3.4), with SQOPT used as the solver. A TrLP with generic bounds can be written as

    minimize_d   gL(x)ᵀd
    subject to   J(x)d = −c(x)
                 bl ≤ x + d ≤ bu
                 ‖d‖∞ ≤ ∆,

which we can rewrite to match the form accepted by SQOPT more closely:

    minimize_d   gL(x)ᵀd
    subject to   J(x)d = −c(x)
                 b̃l ≤ d ≤ b̃u,

with b̃l := max(bl − x, −∆) and b̃u := min(bu − x, ∆), where the max and min are taken componentwise. Implementation details are identical to those described in Section 5.3.4.
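The bound transformation is a componentwise one-liner in Matlab:

    btl = max(bl - x, -Delta);   % tightened lower bounds on the step d
    btu = min(bu - x,  Delta);   % tightened upper bounds on the step d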
5.3.6 Solving MINFR
Feasibility restoration steps in SRLP and STrLP solve MINFR (4.2). With generic bound constraints, this problem can be written

    minimize_d   ‖d‖₂
    subject to   c(x + d) = 0                      (5.4)
                 bl ≤ x + d ≤ bu.

We use SNOPT version 7 [15, 10], a general-purpose algorithm for constrained optimization, to solve these problems. SNOPT accepts problems of the generic form

    minimize_x   f(x)
    subject to   cl ≤ c(x) ≤ cu
                 bl ≤ x ≤ bu,
which easily accommodates problems in the form of Equation 5.4. Relevant SNOPT parameters used in our implementation are listed in Table 5.2.

Major feasibility    1e-8
Optimality           1e-8
Major iterations     8000
Minor iterations     500
Scale option         0

Table 5.2: Relevant SNOPT parameters used for solving MINFR.

The return status from SNOPT is important to our algorithms. A status of 1 or 2 indicates a successful feasibility restoration step; 11, 12, or 13 indicates that NLP may be infeasible; any other status is considered an error. The SNOPT Fortran routines were called from within the Matlab environment using the MEX interface provided with the package.
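A minimal Matlab sketch of how a driver might interpret these return codes (the variable and status names are ours):

    switch info                          % SNOPT exit code
        case {1, 2}
            status = 'restored';         % successful feasibility restoration step
        case {11, 12, 13}
            status = 'nlp-infeasible';   % NLP may be infeasible
        otherwise
            error('Unexpected SNOPT return status: %d', info);
    end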
5.3.7 Optimality tests
SRLP and STrLP are terminated at iteration k, and the primal-dual pair (x^k, y^k) is deemed optimal, if the following conditions hold:

    max( ‖c(x^k)‖∞ , ‖[x^k]⁻‖∞ )      <  η (1 + ‖x^k‖₂)
    max( ‖[z^k]⁻‖∞ , ‖z^k ∘ x^k‖∞ )   <  η (1 + ‖y^k‖₂),

where z^k := ∇f(x^k) − J(x^k)ᵀy^k, and [a]⁻ := max(0, max(−a)). These are the optimality tests proposed by Byrd et al. [3], rearranged to accommodate our formulation of a nonlinear program (which is different from that used by Byrd et al.). We used the value η = 10⁻⁶ in all our experiments.
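A minimal Matlab sketch of this test, assuming the vectors ck = c(xk), gk = ∇f(xk), the Jacobian Jk = J(xk), and the multipliers yk are in scope (the variable names are ours):

    eta     = 1e-6;
    negpart = @(a) max(0, max(-a));                 % [a]^- := max(0, max(-a))
    zk      = gk - Jk' * yk;
    feas    = max(norm(ck, inf), negpart(xk))       < eta * (1 + norm(xk));
    comp    = max(negpart(zk), norm(zk .* xk, inf)) < eta * (1 + norm(yk));
    optimal = feas && comp;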
5.4 Results
Appendices B and C list the results for SRLP and STrLP, respectively, on our test set problems. For each problem, we list the number of major iterations required to solve the problem, the total number of minor iterations (SQOPT iterations) required, the number of feasibility restoration steps which had to be taken (equivalently the number of infeasible subproblems which were encountered for this problem), and the total time taken to solve this problem, among other statistics described in the appendices.
Chapter 6

Discussion

The SL–QP method (Chapter 3) was developed in order to alleviate some of the potential inefficiencies experienced by the widely successful SQP method (Section 2.3) when problem dimensions become very large. Careful analysis of current SL–QP algorithms, however, reveals that they too contain many potential inefficiencies that may prove to be performance bottlenecks for very large-scale problems. This chapter discusses three such inefficiencies, and demonstrates how using an RLP subproblem (4.1) in place of a TrLP subproblem (3.2) could remove them.
6.1 Expensive infeasible subproblems
When an SL–QP algorithm encounters an infeasible subproblem, it must go to extra lengths to determine a new iterate. Section 3.3 discussed the two most common remedies: feasibility restoration and penalty reformulation. While there are tradeoffs between these two methods, both carry a significant expense compared with the cost of a standard iteration, especially for very large-scale problems. Clearly, an SL–QP algorithm should avoid infeasible subproblems whenever possible.

RLP and TrLP subproblems share the same linearized constraints (those which also appear in Equation 3.1), and, if the system these define is inconsistent, both are infeasible. This is the only way an RLP subproblem can be infeasible. Section 3.3 explains, however, that the trust-region constraint can render an otherwise feasible TrLP infeasible. An SL–QP algorithm which uses RLP subproblems will, therefore, encounter fewer infeasible subproblems than one which uses TrLP.

The benefit gained by avoiding these additional infeasible subproblems is, of course, completely dependent on how often they are encountered in practice. Appendices B and C list the number of infeasible subproblems handled for each problem in our test set (or, equivalently, the number of feasibility restoration steps taken) by SRLP and STrLP, respectively. These data are summarized in Figure 6.1, which shows that SRLP handled vastly fewer infeasible subproblems than STrLP: 9 compared with 74. Given the high cost of feasibility restoration and penalty reformulation steps as problems become very large, using an RLP subproblem instead of a TrLP subproblem could greatly increase the efficiency of an SL–QP method due to the reduced number of infeasible subproblems it encounters.
Figure 6.1 (bar chart; x-axis: infeasible subproblems encountered, 0/1/2/3+; y-axis: number of problems): The number of problems (out of 167) in our test set (see Section 5.2) on which the SRLP and STrLP algorithms encountered a given number of infeasible subproblems. SRLP, which solves RLP subproblems, needed to handle only 9 infeasible subproblems over the entire set; its trust-region-based counterpart handled 74.
6.2 Slow active-set updates
We have seen that SQP requires a (potentially very expensive) quadratic programming iteration for every variable added to, or removed from, the active set. SL–QP alleviates this cost by replacing these iterations with cheaper, linear programming ones. However, the very nature of using an active-set method to solve these subproblems (whether IQP or TrLP) requires that elements of the active set are updated one-by-one. Such slow updates could prove to be a bottleneck for current SL–QP methods as problem dimensions (and, thus, the combinatorics of the active set) grow. In this section we explore the possibility of solving SL–QP subproblems with a gradient-projection algorithm, which is able to activate many variables during each iteration.

Definition 6.1. A gradient-projection algorithm generates the new iterate x^{k+1} from its current iterate x^k using

    x^{k+1} = β^k [ x^k − α^k ∇f(x^k) ]_Ω ,

where [·]_Ω denotes projection onto the feasible set Ω, and β^k, α^k ∈ R.

The cost of each gradient-projection step is often dominated by the projection onto the feasible set and, as such, these methods are known to be much more effective
when Ω has a form which is particularly cheap to project onto. Neither TrLP nor RLP is particularly well suited to gradient projection: projection onto their linear equality constraints is a quadratic program unto itself. Friedlander et al. [9], however, suggest that the dual problem to RLP has a form which is very well suited to gradient projection, as we now explore.

A satisfactory treatment of duality theory is outside the scope of this thesis, and is not required for this discussion (the interested reader is directed towards [13, Section 16.8]). Instead, it is sufficient to understand that duality theory for convex quadratic programming tells us that there exists a problem related to RLP (called the dual of RLP) which has identical KKT conditions, and that it is possible to obtain a solution of RLP (the primal problem) by finding a solution of this related problem.

Theorem 6.2. The dual problem of RLP is given by

    NNQP(x, ρ):   minimize_{y,z}   (1/2ρ) ‖J(x)ᵀy + z − g(x)‖²₂ + yᵀc − zᵀx
                  subject to       z ≥ 0.

The dual of RLP, NNQP, is a convex quadratic program with non-negativity constraints. Importantly, its feasible set, Ω = {(y, z) | z ≥ 0}, allows for very cheap projections: simple truncations of z, with linear cost. Gradient projection is very effective on such problems. We can thus solve RLP subproblems (albeit indirectly) very efficiently using a gradient-projection algorithm. This allows for very fast active-set updates, and could prove to be a significant advantage of using RLP over TrLP subproblems in an SL–QP algorithm.
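To make the projection cost concrete, here is one projected-gradient step on NNQP(x, ρ) in Matlab; the fixed step length alpha (and β^k = 1) is our simplification for illustration:

    % One projected-gradient step on the dual NNQP(x, rho).
    r  = (J' * y + z - g) / rho;    % common residual term
    gy = J * r + c;                 % gradient of the dual objective in y
    gz = r - x;                     % gradient of the dual objective in z
    y  = y - alpha * gy;            % y is unconstrained
    z  = max(z - alpha * gz, 0);    % projecting onto {z >= 0} is a truncation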
6.3 Inefficient warm-starts
Warm-start efficiency (see Section 2.2.1) is crucially important for the effectiveness of an SL–QP algorithm. This efficiency depends on how good a prediction the active set at one iteration, A^{k−1}, is of that at the subsequent iteration, A^k. We measure this as warm-start quality, which we define in terms of the overlap between these two sets.

Definition 6.3. The warm-start quality, ω^k, of iteration k for an active-set method is given by

    ω^k := |A^{k−1} ∩ A^k| / |A^k| × 100% ,    (6.1)

where A^{k−1} := A(x^{k−1}) and A^k := A(x^k) are the corresponding active-set predictions, and |·| denotes the cardinality of a set.

An SL–QP algorithm should seek to maximize ω^k at each iteration. We suspect, however, that the trust-region constraints in the TrLP subproblems may significantly reduce it, thus impeding the efficiency of the algorithm.
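In Matlab, with the active sets stored as integer index vectors, Equation 6.1 is a one-liner:

    % Warm-start quality (%) of iteration k (Equation 6.1).
    omega = 100 * numel(intersect(Aprev, Acur)) / numel(Acur);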
Figure 6.2 (line plot; x-axis: iteration, y-axis: warm-start quality (%); series: full active set, natural active set): The warm-start quality, ω^k, measured at each iteration for STrLP on the CUTEr problem HIMMELBI. The full active set (AF) gives a much lower quality than the natural active set (AN) because the trust-region constraints continue to vacillate as we approach an optimal point.

Why would the trust-region adversely affect warm-start quality? Let us consider the constraints in TrLP in two subsets: the natural constraints (c(x) + J(x)d = 0, x + d ≥ 0), and the trust-region constraints (‖d‖∞ ≤ ∆). The natural constraints are a linearized model of those in NLP, and thus are related on subsequent steps. Consequently, we expect some coherence in the active set for these constraints from iteration to iteration, translating into a higher warm-start quality. The trust-region constraints, in contrast, are wholly artificial; they do not model any part of NLP. Instead, they hover around our iterate, x, and grow or shrink suddenly whenever ∆ is updated. We expect much less coherence from these trust-region constraints, which would translate into a lower warm-start quality.

In order to quantify the effect of trust-region constraints on warm-start quality, we consider a subset of the active set of TrLP which includes only the natural constraints.

Definition 6.4. The natural active set of TrLP is the set of active natural constraints, and can be written

    AN := { i : di = −xi } .

We also consider the full active set.

Definition 6.5. The full active set of TrLP is the set of all active constraints in TrLP, including the natural and trust-region constraints. It can be written

    AF := { i : di = max(−xi, −∆) } ∪ { −i : di = ∆ } .

(Definitions 6.4 and 6.5 can be derived directly from Definition 1.3, as shown in Appendix A.)
Figure 6.3 (line plot; x-axis: iteration, y-axis: warm-start quality (%); series: STrLP, SRLP): The warm-start quality at each iteration for both the STrLP and SRLP algorithms on the CUTEr problem HIMMELBI. The full active set (AF) is used in both cases; the absence of trust-region constraints on SRLP's RLP subproblems leads to a much higher warm-start quality.

Let us first consider this effect on a sample problem from our CUTEr test set: HIMMELBI, which has 12 constraints and 100 bounded variables. We run the trust-region based solver, STrLP, on this problem, and measure the warm-start quality, ω^k, at each iteration k using both AN and AF. Figure 6.2 shows the results. Notice how ω^k for AF never reaches 100%: even once we have reached the optimal active set for NLP, the trust-region constraints continue to change iteration by iteration. Warm-start quality for AN, however, increases very quickly as the active set becomes optimal.

An RLP subproblem includes only the natural constraints (we may use AN as the active set for all its constraints), and so we expect the warm-start quality of the SRLP algorithm to be much higher than that of STrLP. We again look at HIMMELBI, and measure ω^k for the full active set of SRLP (which is equivalent to AN). Figure 6.3 compares the warm-start quality of the two algorithms, and we indeed see a significant advantage for SRLP. (Note, though, that these solvers also required a different number of major iterations to solve HIMMELBI.)

This result is typical for our test set. The average warm-start quality over all iterations for each problem is given in Appendices B and C. These data are represented in Figure 6.4, which shows a cumulative distribution function of the number of problems with given warm-start qualities. This plot clearly shows that, on our test set, the warm-start quality was indeed significantly higher when using RLP subproblems instead of TrLP.
Figure 6.4 (x-axis: average warm-start quality (%), y-axis: problem CDF; series: SRLP, STrLP): A cumulative distribution function of the average warm-start quality (over all iterations) for all problems in our test set. The results for both SRLP and STrLP are shown; we see a significantly higher fraction of problems with high average quality for SRLP due to the absence of trust-region constraints.

A higher warm-start quality should result in less work required to solve each subproblem. In our implementations, which use SQOPT as the subproblem solver, this translates into fewer minor iterations required per major iteration. Figure 6.5 shows the number of minor iterations required at each iteration, for both SRLP and STrLP, on HIMMELBI, and demonstrates a large reduction in subproblem cost for SRLP due to increased warm-start efficiency. Figure 6.6 shows a cumulative distribution function of the average number of minor iterations per major iteration for each problem in our test set. We see here that the major iterations for SRLP were indeed cheaper on average than those of STrLP. It seems, then, that a significant advantage can be gained, due to increased warm-start efficiency, by replacing the TrLP subproblems with RLPs in an SL–QP algorithm.
Figure 6.5 (line plot; x-axis: iteration, y-axis: minor iterations; series: SRLP, STrLP): The number of minor (SQOPT) iterations taken by both SRLP and STrLP at each iteration while solving HIMMELBI. Due to a much higher warm-start quality, SRLP enjoys significantly cheaper iterations than STrLP on this problem.
Figure 6.6 (x-axis: minor iterations per major iteration, log scale from 10⁻² to 10³; y-axis: problem CDF; series: SRLP, STrLP): A cumulative distribution function of the average number of minor (SQOPT) iterations taken per major iteration for all problems in our test set. SRLP and STrLP results are shown. Major iterations for SRLP were cheaper over this test set, presumably due to an increased warm-start quality.
Chapter 7

Conclusions

In this thesis, we explored the effectiveness of a proposed variant of current SL–QP methods at solving very large-scale nonlinear programs. Instead of solving a TrLP subproblem (Equation 3.2) at each iteration to update a prediction of the optimal active set, this new method solves an RLP subproblem (Equation 4.1). Two algorithms were presented to facilitate a comparison between these subproblem choices: SRLP, which successively takes steps generated by solving an RLP subproblem (Algorithm 4.3), and STrLP, a benchmarking algorithm which takes TrLP steps (Algorithm 5.1).

Experiments using a diverse set of problems selected from the CUTEr set (Section 5.2) demonstrated that the trust-region constraints in TrLP subproblems can lead to significant inefficiencies. These constraints often greatly reduced the effectiveness of warm-starting successive subproblem solves, which led to more expensive subproblem solves (Section 6.3). They were also shown to induce a significant number of infeasible subproblems, which led to very expensive iterations (Section 6.1).

In contrast, it was shown that using an RLP subproblem instead of a TrLP largely eliminates these inefficiencies. The RLP subproblem contains no artificial constraints and, as such, enjoys a much higher average warm-start efficiency on subsequent solves, and thus cheaper subproblem solves, than TrLP subproblems. Also, regularization cannot induce infeasibility, and thus SRLP encountered far fewer infeasible subproblems than STrLP (Section 6.1).

Additionally, we showed that an RLP subproblem has another potential advantage over a TrLP: its dual problem is a bound-constrained least-squares problem (Section 6.2). This form is very conducive to gradient-projection-type solvers, which could prove significantly more effective than active-set subproblem solvers for certain large-scale problems. An RLP-based algorithm could greatly benefit from the ability to choose a gradient-projection-type subproblem solver.

Based on these results, it seems likely that our proposed variant of SL–QP will prove effective on very large-scale nonlinear programs. It is our intention that the SRLP algorithm presented here will serve as a base for such an algorithm, with an EQP step added to incorporate second-order information about NLP. Some important considerations must be made when adding an EQP step, however: (i) a strategy for controlling the length of generated EQP steps, (ii) the selection of a non-degenerate subset of the active set to ensure that EQP is not inconsistent, and (iii) the choice of a trial step which combines the LP and EQP steps. These issues are discussed in detail by Byrd et al. [3] and Waltz [19].
Bibliography

[1] Paul T. Boggs and Jon W. Tolle. Sequential quadratic programming for large-scale nonlinear optimization. Journal of Computational and Applied Mathematics, 124(1-2):123–137, 2000.
[2] Brian Borchers and John E. Mitchell. An improved branch and bound algorithm for mixed integer nonlinear programs. Computers and Operations Research, 21(4):359–367, 1994.
[3] Richard H. Byrd, Nicholas I. M. Gould, Jorge Nocedal, and Richard A. Waltz. An algorithm for nonlinear optimization using linear programming and equality constrained subproblems. Mathematical Programming, 100(1):27–48, 2004.
[4] Richard H. Byrd, Jorge Nocedal, and Richard A. Waltz. Steering exact penalty methods for nonlinear programming. Optimization Methods and Software, 23(2):197–213, 2008.
[5] Choong Ming Chin and Roger Fletcher. On the global convergence of an SLP-filter algorithm that takes EQP steps. Mathematical Programming, 96:161–177, 2003.
[6] Anthony V. Fiacco and Garth P. McCormick. The sequential unconstrained minimization technique for nonlinear programming, a primal-dual method. Management Science, 10(2):360–366, 1964.
[7] R. Fletcher and E. Sainz de la Maza. Nonlinear programming and nonsmooth optimization by successive linear programming. Mathematical Programming, 43(3):235–256, 1989.
[8] Roger Fletcher and Sven Leyffer. Nonlinear programming without a penalty function. Mathematical Programming, 91:239–269, 2000.
[9] Michael P. Friedlander, Nicholas I. M. Gould, Sven Leyffer, and Todd S. Munson. A filter active-set trust-region method. Technical Report ANL/MCS-P1456-0907, September 2007.
[10] P. E. Gill, W. Murray, and M. A. Saunders. SNOPT: an SQP algorithm for large-scale constrained optimization. SIAM Review, 47:99–131, January 2005.
[11] Jacek Gondzio and Andreas Grothey. A new unblocking technique to warmstart interior point methods based on sensitivity analysis. SIAM Journal on Optimization, 19(3):1184–1210, 2008.
[12] Nicholas I. M. Gould, Dominique Orban, and Philippe L. Toint. CUTEr and SifDec: a constrained and unconstrained testing environment, revisited. ACM Transactions on Mathematical Software, 29(4):373–394, 2003.
[13] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, New York, NY, 1999.
[14] F. Palacios-Gomez, L. Lasdon, and M. Engquist. Nonlinear optimization by successive linear programming. Management Science, 28(10):1106–1120, 1982.
[15] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SNOPT Version 7: Software for Large-Scale Nonlinear Programming. http://www.stanford.edu/group/SOL/snopt.htm, February 2008.
[16] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SQOPT Version 7: Software for Large-Scale Linear and Quadratic Programming. http://www.stanford.edu/group/SOL/sqopt.htm, June 2008.
[17] Stephen M. Robinson. Perturbed Kuhn-Tucker points and rates of convergence for a class of nonlinear-programming algorithms. Mathematical Programming, 7:1–16, 1974.
[18] Andreas Wächter and Lorenz T. Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming, 106(1):25–57, 2006.
[19] Richard A. Waltz. Algorithms for large-scale nonlinear optimization. http://www.eecs.northwestern.edu/~rwaltz/articles.html, June 2002.
Appendix A

Derivation of the active sets for TrLP subproblems

We define an active set for NLP in Definition 1.3. TrLP is a specific case of NLP, and so this definition still holds; however, it would require us to reformulate TrLP into the format given by Equation 1.1. Instead, we redefine the active set for TrLP. Section 6.3 defines two subsets of constraints, and accordingly two active sets; in this appendix we derive these definitions from Definition 1.3.

The natural bound constraints of TrLP, rewritten to match NLP, give d ≥ −x. We can thus simply write

    AN := { i : di = −xi } .

The trust-region constraints make the active set slightly more cumbersome. The bound constraints, together with the trust-region constraints of TrLP, can be written succinctly as

    max(−xi, −∆) ≤ di ≤ ∆ .

Because a variable that is active at its lower bound should not be considered equivalent to one at its upper bound, we begin by defining two separate active sets for TrLP,

    AFL := { i : di = max(−xi, −∆) } ,   and
    AFU := { i : di = ∆ } .

To simplify our notation, and to remain consistent with our earlier definition of an active set, we combine these active sets by inserting variables at their upper bound with a negative index (in this way the definition of ω in Equation 6.1 can be used). Our definition of the full active set for TrLP is, thus,

    AF := AFL ∪ (−AFU) ,

where −ζ := { −a | a ∈ ζ }.
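A minimal Matlab sketch of this signed-index construction; the tolerance tol for detecting an active bound is our own addition:

    % Build the full active set AF of TrLP from the step d, the iterate x,
    % and the trust-region radius Delta, using signed indices.
    lower = find(abs(d - max(-x, -Delta)) <= tol);  % AFL: active at lower bound
    upper = find(abs(d - Delta) <= tol);            % AFU: active at trust-region bound
    AF    = [lower; -upper];                        % upper-bound indices are negated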
Appendix B

SRLP Results

This appendix lists the results of the algorithm SRLP on each problem in the test set defined in Section 5.2.

problem   name of CUTEr problem
n         number of variables
m         number of constraints
maj       number of major iterations taken
min       number of minor iterations taken
fr        number of feasibility restoration steps taken
time      time taken to solve in seconds
ω         measure of subproblem warm-start quality (Equation 6.1)
f*        value of the objective at the solution

Table B.1: Definition of symbols used in Table B.2.

problem    n    m    maj   min   fr  time   ω        f*
alsotame   2    1    7     7     0   0.257  95.238   8.21e-02
bt13       5    1    57    62    0   0.304  99.573   0
cantilvr   5    1    17    18    0   0.166  100.000  1.34e+00
cb2        3    3    11    13    0   0.199  96.296   1.95e+00
cb3        3    3    6     5     0   0.146  97.222   2.00e+00
chaconn1   3    3    12    10    0   0.150  98.611   1.95e+00
chaconn2   3    3    6     1     0   0.142  100.000  2.00e+00
concon     15   11   4     13    0   0.174  99.038   -6.23e+03
congigmz   3    5    65    79    0   0.290  98.566   2.80e+01
csfi1      5    4    48    59    0   0.241  98.551   -4.91e+01
csfi2      5    4    56    144   0   0.285  84.127   5.50e+01
dembo7     16   20   1464  1583  0   2.586  99.837   1.75e+02
demymalo   3    3    10    13    0   0.182  94.444   -3.00e+00
dnieper    61   24   9     269   0   0.211  80.196   1.87e+04
eqc        9    3    2     5     0   0.174  91.667   -8.30e+02
expfita    5    22   34    78    0   0.225  98.810   1.14e-03
gigomez1   3    3    9     11    0   0.181  94.444   -3.00e+00
gigomez2   3    3    14    16    0   0.155  97.436   1.95e+00
gigomez3   3    3    6     5     0   0.144  97.222   2.00e+00
haifas     13   9    35    54    0   0.223  97.947   -4.50e-01
haldmads   6    42   12    166   0   0.193  88.447   1.22e-04
himmelbk   24   14   50    121   0   0.250  97.530   5.18e-02
himmelp4   2    3    283   298   0   0.614  98.660   -5.90e+01
himmelp5   2    3    28    34    0   0.180  93.077   -5.90e+01
himmelp6   2    5    110   123   0   0.341  98.393   -5.90e+01
hong       4    1    26    61    0   0.209  94.667   2.26e+01
hs10       2    1    16    17    0   0.191  100.000  -1.00e+00
hs102      7    5    956   3571  0   3.060  95.359   9.12e+02
hs103      7    5    284   1106  0   0.835  94.674   5.44e+02
hs104      8    5    94    105   0   0.318  99.827   3.95e+00
hs106      8    6    21    40    0   0.209  97.768   9.53e+03
hs107      9    6    20    74    0   0.200  95.556   5.06e+03
hs108      9    13   25    60    0   0.210  97.727   -5.00e-01
hs11       2    1    17    18    0   0.200  100.000  -8.50e+00
hs111      10   3    20    29    0   0.200  100.000  -4.78e+01
hs112      10   3    663   794   0   1.236  99.439   -4.78e+01
hs114      10   11   707   747   0   1.263  99.852   -1.76e+03
hs116      13   14   46    86    0   0.250  98.589   1.85e+02
hs117      15   5    351   571   0   0.786  99.380   3.23e+01
hs119      16   8    11    56    0   0.202  86.806   2.45e+02
hs12       2    1    180   184   0   0.567  99.708   -3.00e+01
hs13       2    1    14    3     0   0.154  95.238   1.01e+00
hs14       2    2    5     2     0   0.177  100.000  1.39e+00
hs15       2    2    17    1     0   0.175  95.000   3.06e+02
hs16       2    2    4     0     0   0.140  93.750   2.31e+01
hs17       2    2    11    7     0   0.153  78.571   1
hs18       2    2    103   104   0   0.289  99.500   5.00e+00
hs19       2    2    5     3     0   0.140  85.000   -6.96e+03
hs20       2    3    3     4     0   0.176  86.667   4.02e+01
hs22       2    2    4     4     0   0.194  87.500   1.00e+00
hs23       2    5    20    22    0   0.199  99.248   2.00e+00
hs24       2    3    8     13    0   0.178  95.000   -1.00e+00
hs29       3    1    248   252   0   0.557  99.860   -2.26e+01
hs30       3    1    15    7     0   0.189  98.214   1.00e+00
hs31       3    1    19    21    0   0.161  96.667   6.00e+00
hs32       3    2    14    18    0   0.188  98.571   1.00e+00
hs33       3    2    9     7     0   0.149  97.778   -4.00e+00
hs34       3    2    13    15    0   0.186  93.846   -8.34e-01
hs36       3    1    2     4     0   0.170  62.500   -3.30e+03
hs37       3    2    9     13    0   0.182  90.000   -3.46e+03
hs41       4    1    6     8     0   0.178  90.000   1.93e+00
hs43       4    3    89    102   0   0.313  99.532   -4.40e+01
hs54       6    1    2     4     0   0.177  100.000  -7.22e-34
hs55       6    6    2     5     0   0.173  91.667   6.67e+00
hs57       2    1    8     10    0   0.187  100.000  3.06e-02
hs59       2    3    88    133   0   0.376  91.786   -7.80e+00
hs60       3    1    18    20    0   0.197  100.000  3.26e-02
hs62       3    1    163   181   0   0.389  100.000  -2.63e+04
hs64       3    1    44    71    0   0.207  98.333   6.30e+03
hs65       3    1    129   155   0   0.339  98.101   9.54e-01
hs66       3    2    15    19    0   0.189  96.000   5.18e-01
hs68       4    2    363   375   0   0.712  99.898   -9.20e-01
hs71       4    2    14    18    0   0.189  95.000   1.70e+01
hs72       4    2    8     27    0   0.145  100.000  5.47e+02
hs73       4    3    7     11    0   0.181  89.796   2.99e+01
hs74       4    5    20    28    0   0.200  98.889   5.13e+03
hs75       4    5    10    11    0   0.172  96.296   5.17e+03
hs80       5    3    8     12    0   0.180  100.000  5.39e-02
hs81       5    3    7     7     0   0.143  100.000  5.39e-02
hs83       5    3    3     5     0   0.182  83.333   -3.07e+04
hs84       5    3    69    116   0   0.284  97.750   -5.28e+06
hs85       5    21   364   397   0   0.758  99.947   -1.68e+00
hs86       5    10   36    81    0   0.230  96.410   -3.23e+01
hs88       2    1    192   195   0   0.794  100.000  1.36e+00
hs89       3    1    204   208   0   0.853  100.000  1.36e+00
hs90       4    1    448   453   0   1.994  100.000  1.36e+00
hs91       5    1    407   413   0   2.032  100.000  1.36e+00
hs92       6    1    44    51    0   0.366  100.000  1.29e+00
hs93       6    2    1056  1074  0   1.783  99.973   1.35e+02
hs95       6    4    8     20    0   0.181  91.250   1.56e-02
hs96       6    4    8     13    0   0.146  91.250   1.56e-02
hs97       6    4    10    14    0   0.151  92.000   4.07e+00
hs98       6    4    10    13    0   0.151  92.000   4.07e+00
hs99       7    2    123   247   0   0.359  100.000  -8.31e+08
hubfit     2    1    10    13    0   0.196  92.593   1.69e-02
kiwcresc   3    2    12    17    0   0.196  95.556   -4.89e-08
loadbal    31   31   37    80    0   0.233  99.478   4.53e-01
lsnnodoc   5    4    2     5     0   0.171  83.333   1.23e+02
madsen     3    6    22    36    0   0.201  96.732   6.16e-01
makela1    3    2    6     9     0   0.176  93.333   -1.41e+00
makela2    3    3    35    40    0   0.223  99.048   7.20e+00
matrix2    6    2    18    27    0   0.200  97.500   6.16e-07
mconcon    15   11   4     12    0   0.182  99.038   -6.23e+03
mifflin1   3    2    8     14    0   0.185  93.333   -1.00e+00
mifflin2   3    2    11    15    0   0.187  95.556   -1.00e+00
minmaxrb   3    4    27    30    0   0.210  98.901   4.51e-16
mistake    9    13   22    57    0   0.208  96.717   -1.00e+00
mribasis   36   55   107   239   0   0.377  99.486   1.82e+01
odfits     10   6    22    32    0   0.206  100.000  -2.38e+03
optcntrl   32   20   3     32    0   0.184  85.897   5.50e+02
pentagon   6    15   13    35    0   0.199  98.901   1.56e-04
polak1     3    2    15    17    0   0.192  100.000  2.72e+00
polak4     3    3    4     6     0   0.181  91.667   -2.66e-13
polak5     3    2    3     5     0   0.175  100.000  5.00e+01
prodpl0    60   29   11    184   0   0.199  85.815   5.88e+01
prodpl1    60   29   10    84    0   0.152  92.759   3.57e+01
qc         9    4    2     11    0   0.171  73.077   -9.57e+02
qcnew      9    3    2     5     0   0.178  91.667   -8.07e+02
rk23       17   11   7     22    0   0.180  97.024   8.33e-02
s365       7    5    3     18    0   0.176  77.778   0
snake      2    2    4     5     0   0.217  93.750   2.32e-14
stancmin   3    2    2     4     0   0.173  90.000   4.25e+00
synthes1   6    6    19    30    0   0.204  95.833   7.59e-01
synthes2   11   14   17    87    0   0.202  88.727   -5.54e-01
synthes3   17   23   35    147   0   0.239  95.096   1.51e+01
tenbars1   18   9    57    82    1   0.469  99.581   2.30e+03
tenbars2   18   8    87    110   1   0.469  99.768   2.30e+03
tenbars3   18   8    76    92    1   0.516  99.737   2.25e+03
truspyr1   11   4    1854  1895  1   3.238  99.982   1.12e+01
truspyr2   11   11   9     30    1   0.240  95.960   1.12e+01
try-b      2    1    7     1     0   0.179  100.000  1.00e+00
twobars    2    2    15    17    0   0.212  98.077   1.51e+00
water      31   10   155   197   0   0.430  99.179   1.05e+04
womflet    3    3    23    36    0   0.203  96.667   4.57e-08
zecevic3   2    2    22    40    0   0.219  92.857   9.73e+01
zecevic4   2    2    33    45    0   0.189  97.115   7.56e+00
zy2        3    2    9     10    0   0.182  93.333   2.00e+00
airport    84   42   72    1086  0   0.462  94.520   4.80e+04
batch      48   73   27    386   0   0.236  91.804   2.59e+05
britgas    450  360  12    1063  0   0.328  98.066   0
core1      65   59   19    112   0   0.202  98.005   9.11e+01
core2      157  134  82    365   0   0.370  99.444   7.29e+01
dallasm    196  151  453   1391  0   1.667  99.857   -4.82e+04
eigmaxa    101  101  2     100   1   0.698  97.525   -1
eigmaxb    101  101  7     101   0   0.187  99.929   -9.67e-04
eigmina    101  101  2     101   1   0.693  97.525   1.00e+00
eigminb    101  101  7     101   0   0.186  99.929   9.67e-04
expfitb    5    102  26    160   0   0.227  99.439   5.02e-03
expfitc    5    502  36    718   0   0.433  99.750   2.33e-02
feedloc    90   259  152   12375 0   1.924  96.584   6.39e-19
grouping   100  125  1     50    0   0.179  100.000  1.39e+01
haifam     99   150  67    355   0   0.465  99.743   -4.50e+01
himmelbi   100  12   469   773   0   1.176  99.505   -1.74e+03
hydroels   169  168  21    95    0   0.216  99.407   -3.47e+06
lakes      90   78   14    147   1   0.823  99.628   3.51e+05
leaknet    156  153  1     157   0   0.178  100.000  1.53e+01
reading6   102  50   111   376   0   0.428  99.485   -1.45e+02
robot      14   2    2     8     1   0.269  81.250   5.46e+00
ssebnln    194  96   10    342   0   0.206  92.931   1.62e+07
tfi1       3    101  467   596   0   1.047  99.705   5.33e+00
tfi3       3    101  14    178   0   0.202  98.077   4.30e+00
twirism1   343  313  67    602   0   0.751  99.665   -1.01e+00
yorknet    312  256  16    532   0   0.263  96.420   2.44e+04
zamb2-10   270  96   32    896   0   0.334  95.807   -1.57e+00
zamb2-11   270  96   140   1129  0   0.656  98.815   -1.11e+00
zamb2-8    138  48   24    285   0   0.238  96.006   -1.52e-01
zamb2-9    138  48   126   722   0   0.494  98.183   -3.50e-01

Table B.2: SRLP Results.
Appendix C

STrLP Results

This appendix lists the results of the algorithm STrLP on each problem in the test set defined in Section 5.2.

problem   name of CUTEr problem
n         number of variables
m         number of constraints
maj       number of major iterations taken
min       number of minor iterations taken
fr        number of feasibility restoration steps taken
time      time taken to solve in seconds
ω         measure of subproblem warm-start quality (Equation 6.1)
f*        value of the objective at the solution

Table C.1: Definition of symbols used in Table C.2.

problem    n    m    maj  min    fr  time   ω        f*
alsotame   2    1    4    2      0   0.401  91.667   8.21e-02
bt13       5    1    25   23     2   1.989  77.778   0
cantilvr   5    1    28   125    1   0.227  33.333   1.34e+00
cb2        3    3    38   39     0   0.251  84.848   1.95e+00
cb3        3    3    6    1      0   0.142  97.222   2.00e+00
chaconn1   3    3    38   35     0   0.195  85.000   1.95e+00
chaconn2   3    3    6    0      0   0.144  100.000  2.00e+00
concon     15   11   2    11     1   0.241  78.846   -6.23e+03
congigmz   3    5    16   22     1   0.235  86.250   2.80e+01
csfi1      5    4    53   139    1   0.316  72.629   -4.91e+01
csfi2      5    4    2    3      1   0.213  77.778   5.50e+01
dembo7     16   20   2    39     1   0.280  77.778   2.10e+02
demymalo   3    3    9    10     0   0.185  89.583   -3.00e+00
dnieper    61   24   8    222    1   0.382  83.294   1.87e+04
eqc        9    3    1    5      0   0.181  83.333   -8.30e+02
expfita    5    22   31   127    0   0.226  87.037   1.14e-03
gigomez1   3    3    11   18     1   0.250  83.333   -3.00e+00
gigomez2   3    3    38   36     0   0.194  84.848   1.95e+00
gigomez3   3    3    6    1      0   0.146  97.222   2.00e+00
haifas     13   9    49   60     0   0.249  93.007   -4.50e-01
haldmads   6    42   21   218    1   0.342  86.553   1.22e-04
himmelbk   24   14   27   105    0   0.217  90.486   5.18e-02
himmelp4   2    3    624  1242   0   1.107  60.285   -5.90e+01
himmelp5   2    3    15   18     1   0.190  66.154   -5.90e+01
himmelp6   2    5    317  630    2   0.716  71.772   -5.90e+01
hong       4    1    43   144    0   0.237  20.833   2.26e+01
hs10       2    1    35   37     1   0.276  66.667   -1.00e+00
hs102      7    5    28   47     1   1.118  85.714   1.28e+03
hs103      7    5    24   41     1   1.892  86.111   9.24e+02
hs104      8    5    131  544    1   0.422  69.161   3.95e+00
hs106      8    6    21   60     0   0.203  80.357   9.14e+03
hs107      9    6    38   65     0   0.226  91.111   5.06e+03
hs108      9    13   44   265    1   0.309  79.273   -6.75e-01
hs11       2    1    27   28     1   0.236  66.667   -8.50e+00
hs111      10   3    29   223    0   0.217  49.744   -4.78e+01
hs112      10   3    353  2731   0   0.739  42.388   -4.78e+01
hs114      10   11   679  925    0   1.186  94.260   -1.77e+03
hs116      13   14   5    42     1   0.270  74.074   1.90e+02
hs117      15   5    550  2926   0   1.062  75.899   3.23e+01
hs119      16   8    26   126    0   0.212  82.083   2.45e+02
hs12       2    1    80   133    0   0.283  41.026   -3.00e+01
hs13       2    1    14   2      1   0.176  90.476   1.01e+00
hs14       2    2    8    8      0   0.192  87.500   1.39e+00
hs15       2    2    7    1      1   0.174  75.000   3.06e+02
hs16       2    2    4    0      0   0.151  93.750   2.31e+01
hs17       2    2    14   6      0   0.152  75.000   1
hs18       2    2    51   52     2   0.257  73.333   5.00e+00
hs19       2    2    6    4      1   0.173  75.000   -6.96e+03
hs20       2    3    3    4      0   0.183  86.667   4.02e+01
hs22       2    2    8    10     0   0.180  81.250   1.00e+00
hs23       2    5    15   16     1   0.220  90.909   2.00e+00
hs24       2    3    2    5      0   0.170  80.000   -1.00e+00
hs29       3    1    542  1757   0   0.996  25.635   -2.26e+01
hs30       3    1    29   35     0   0.209  73.077   1.00e+00
hs31       3    1    37   74     0   0.188  45.833   6.00e+00
hs32       3    2    3    6      0   0.174  73.333   1
hs33       3    2    4    1      0   0.140  95.000   -4.00e+00
hs34       3    2    15   27     0   0.190  77.778   -8.34e-01
hs36       3    1    2    5      0   0.170  50.000   -3.30e+03
hs37       3    2    39   86     0   0.225  59.130   -3.46e+03
hs41       4    1    22   44     0   0.207  60.000   1.93e+00
hs43       4    3    174  507    0   0.431  61.238   -4.40e+01
hs54       6    1    2    4      0   0.172  78.571   -1.43e-88
hs55       6    6    1    5      0   0.171  83.333   6.67e+00
hs57       2    1    39   55     0   0.222  33.333   3.06e-02
hs59       2    3    54   73     2   0.310  68.387   -7.80e+00
hs60       3    1    37   65     0   0.228  51.389   3.26e-02
hs62       3    1    168  406    0   0.381  50.000   -2.63e+04
hs64       3    1    7    22     1   0.196  30.000   6.65e+03
hs65       3    1    773  2330   1   1.357  25.435   9.54e-01
hs66       3    2    30   46     0   0.209  76.000   5.18e-01
hs68       4    2    376  744    0   0.741  66.810   -9.20e-01
hs71       4    2    37   40     0   0.226  82.500   1.70e+01
hs72       4    2    7    14     1   0.206  61.111   7.28e+02
hs73       4    3    3    7      1   0.210  71.429   2.99e+01
hs74       4    5    35   39     1   0.266  88.889   5.13e+03
hs75       4    5    5    3      1   0.188  91.111   5.17e+03
hs80       5    3    26   49     1   0.237  76.250   5.39e-02
hs81       5    3    30   63     0   0.191  75.962   5.39e-02
hs83       5    3    3    5      0   0.178  83.333   -3.07e+04
hs84       5    3    9    45     0   0.183  50.000   -2.38e+06
hs85       5    21   8    55     0   0.186  84.615   -1.25e+00
hs86       5    10   42   119    0   0.237  80.303   -3.23e+01
hs88       2    1    27   33     1   0.294  66.667   1.36e+00
hs89       3    1    51   112    1   0.457  50.000   1.36e+00
hs90       4    1    94   316    1   0.422  39.231   1.36e+00
hs91       5    1    70   263    1   0.658  33.696   1.36e+00
hs92       6    1    136  697    1   0.589  31.905   1.36e+00
hs93       6    2    613  2470   0   1.135  50.050   1.35e+02
hs95       6    4    1    11     0   0.170  90.000   1.56e-02
hs96       6    4    1    0      0   0.137  90.000   1.56e-02
hs97       6    4    4    10     0   0.156  70.000   3.14e+00
hs98       6    4    4    12     0   0.171  70.000   3.14e+00
hs99       7    2    85   449    0   0.313  48.457   -8.31e+08
hubfit     2    1    31   47     0   0.253  36.111   1.69e-02
kiwcresc   3    2    34   43     1   0.309  76.667   -4.47e-08
loadbal    31   31   35   499    0   0.241  78.802   4.53e-01
lsnnodoc   5    4    2    5      0   0.171  83.333   1.23e+02
madsen     3    6    38   57     0   0.225  86.772   6.16e-01
makela1    3    2    44   53     0   0.228  76.800   -1.41e+00
makela2    3    3    29   33     0   0.210  82.540   7.20e+00
matrix2    6    2    23   29     1   0.295  78.125   1.03e-01
mconcon    15   11   2    11     1   0.242  78.846   -6.23e+03
mifflin1   3    2    35   47     0   0.218  76.842   -1.00e+00
mifflin2   3    2    78   160    2   0.373  61.333   -1.00e+00
minmaxrb   3    4    23   39     0   0.201  74.107   0
mistake    9    13   42   335    0   0.240  75.413   -1.00e+00
mribasis   36   55   58   544    1   0.505  93.626   3.15e+01
odfits     10   6    32   148    1   0.336  47.443   -2.38e+03
optcntrl   32   20   4    49     1   0.287  80.769   5.50e+02
pentagon   6    15   15   48     0   0.195  80.000   1.37e-04
polak1     3    2    23   25     0   0.201  80.000   2.72e+00
polak4     3    3    8    18     1   0.222  70.000   2.24e+01
polak5     3    2    23   22     1   0.242  70.000   5.00e+01
prodpl0    60   29   8    71     0   0.182  93.399   5.88e+01
prodpl1    60   29   9    109    0   0.151  91.386   3.57e+01
qc         9    4    1    11     0   0.174  46.154   -9.57e+02
qcnew      9    3    1    5      0   0.181  83.333   -8.07e+02
rk23       17   11   60   112    1   0.334  93.304   8.33e-02
s365       7    5    2    7      0   0.171  83.333   0
snake      2    2    2    3      0   0.169  87.500   1.20e-13
stancmin   3    2    2    3      1   0.195  70.000   4.25e+00
synthes1   6    6    37   66     0   0.221  87.281   7.59e-01
synthes2   11   14   10   61     1   0.260  76.000   -5.54e-01
synthes3   17   23   50   272    1   0.357  87.891   1.51e+01
tenbars1   18   9    104  553    1   0.522  82.463   2.30e+03
tenbars2   18   8    116  606    1   0.545  82.284   2.30e+03
tenbars3   18   8    63   396    1   0.503  78.114   2.25e+03
truspyr1   11   4    303  2319   1   0.800  53.866   1.17e+01
truspyr2   11   11   2    19     1   0.244  81.818   1.12e+01
try-b      2    1    6    1      0   0.181  100.000  1.00e+00
twobars    2    2    35   36     1   0.245  73.529   1.51e+00
water      31   10   17   126    1   0.292  80.976   1.07e+04
womflet    3    3    25   35     1   0.238  85.294   0
zecevic3   2    2    54   84     1   0.289  63.281   9.73e+01
zecevic4   2    2    44   77     0   0.207  57.292   7.56e+00
zy2        3    2    4    5      0   0.179  85.000   2.00e+00
airport    84   42   192  8015   1   1.236  66.555   4.80e+04
batch      48   73   43   380    0   0.250  86.860   2.59e+05
britgas    450  360  14   1742   1   5.907  95.150   0
core1      65   59   2    56     1   0.497  86.290   9.11e+01
core2      157  134  2    133    1   2.117  88.660   7.29e+01
dallasm    196  151  55   3570   0   0.491  71.025   -4.82e+04
eigmaxa    101  101  2    100    1   0.701  97.525   -1
eigmaxb    101  101  7    101    0   0.185  99.929   -9.67e-04
eigmina    101  101  2    101    1   0.694  97.525   1.00e+00
eigminb    101  101  7    101    0   0.186  99.929   9.67e-04
expfitb    5    102  29   381    0   0.242  95.736   5.02e-03
expfitc    5    502  36   1701   0   0.497  98.981   2.33e-02
feedloc    90   259  2    601    1   1.349  78.940   0
grouping   100  125  1    50     0   0.176  100.000  1.39e+01
haifam     99   150  11   293    0   0.212  96.185   9.98e+01
himmelbi   100  12   447  11820  0   3.763  75.831   -1.74e+03
hydroels   169  168  1    30     0   0.178  98.220   -3.43e+06
lakes      90   78   2    107    1   0.821  80.952   2.50e+11
leaknet    156  153  1    158    0   0.179  99.029   8.98e+00
reading6   102  50   51   1231   1   0.752  89.686   -1.45e+02
robot      14   2    2    8      1   0.272  81.250   5.46e+00
ssebnln    194  96   3    392    1   1.028  58.391   2.06e+07
tfi1       3    101  53   647    0   0.316  94.373   5.33e+00
tfi3       3    101  9    478    0   0.208  98.077   4.30e+00
twirism1   343  313  53   2891   0   0.847  94.233   -1.01e+00
yorknet    312  256  22   763    1   2.341  93.486   1.45e+04
zamb2-10   270  96   22   1099   0   0.268  89.344   -1.58e+00
zamb2-11   270  96   19   1632   0   0.257  81.379   -1.12e+00
zamb2-8    138  48   40   1537   0   0.303  82.484   -1.53e-01
zamb2-9    138  48   26   1061   0   0.230  84.097   -3.54e-01

Table C.2: STrLP Results.