A deficient-basis dual counterpart of Paparrizos, Samaras and Stephanides' primal-dual simplex-type algorithm†

Pablo Guerrero-García‡ and Ángel Santos-Palomo
Dept. Applied Mathematics, University of Málaga, 29071 Málaga, Spain
{pablito,santos}@ctima.uma.es

(Received 30 September 2007; in final form 30 September 2007)

In a recent primal-dual simplex-type algorithm [K. Paparrizos, N. Samaras and G. Stephanides, 2003, A new efficient primal dual simplex algorithm. Computers & Operations Research 30, 1383–1399], the authors show how to take advantage of the knowledge of a primal feasible point, but they work with a square basis during the whole process. In this paper we address what could be thought of as its deficient-basis dual counterpart by showing how to take advantage of the knowledge of a dual feasible point in a deficient-basis simplex-type environment. Three small examples are given to illustrate how the specific pivoting rules designed for the proposed algorithm deal with unbounded dual objectives, non-unique dual solutions and a classical exponential example by Goldfarb, thus avoiding some caveats of the dual simplex method. No computational results are reported, but we sketch some details of a sparse projected-gradient implementation in terms of Davis and Hager's cholmod sparse Cholesky factorization (which is row updatable/downdatable) to solve the underlying least-squares subproblems, namely linear least-squares problems and projections onto linear manifolds.

Keywords: linear programming; deficient-basis simplex-type method; sparse Cholesky factorization

AMS Subject Classification: 90C05; 65F05; 65F50

† Technical Report MA-07-01 (http://www.satd.uma.es/matap/investig.htm), Dept. Applied Mathematics, Univ. Málaga, September 2007. Talk presented at the Joint 2nd Conference on Optimization Methods & Software and 6th EUROPT Workshop on Advances in Continuous Optimization, Prague, Czech Republic, July 2007.
‡ Corresponding author.
1 Introduction
In the last decade there has been an increasing interest in the so-called deficient-basis simplex-type methods for linear programming [1–9], in which the basis matrix is not restricted to be a square matrix. None of these papers shows how to take advantage of the a priori knowledge of some information about the feasible region. However, this situation is often encountered in practice. As pointed out by Murty [10, §6.8], in single commodity transportation
and network flow problems, finding an initial dual feasible point (which need not be basic) turns out to be a trivial task. On the other hand, Paparrizos, Samaras and Stephanides [11] have recently shown how to take advantage of the knowledge of a primal feasible point in a primal-dual simplex-type algorithm, but they work with a square basis during the whole process. In this paper we combine the best of both worlds by addressing what could be thought of as its deficient-basis dual counterpart; namely, we take advantage of the knowledge of a dual feasible point in a deficient-basis simplex-type environment.

This paper is organized as follows. After introducing some deficient-basis terminology, in §2 the proposed algorithm is described by performing two modifications to the deficient-basis primal simplex method. Three small examples are given in §3 to illustrate how the specific pivoting rules designed for the proposed algorithm deal with unbounded dual objectives, non-unique dual solutions and a classical exponential example by Goldfarb [12], thus avoiding some caveats of the dual simplex method. No computational results are reported, but in §4 we sketch some details of a sparse projected-gradient implementation in terms of Davis and Hager's cholmod sparse Cholesky factorization (which is row updatable/downdatable, cf. [13]) to solve the underlying least-squares subproblems, namely linear least-squares problems and projections onto linear manifolds. Some concluding remarks and future work are included in §5.
2 The proposed algorithm
Let us consider the standard form of a primal linear program as described in classical Operations Research textbooks like e.g. [14, §4.2],

    (P)   max L(x) := c^T x,  x ∈ R^n
          s.t. Ax = b,  x ≥ O

where A ∈ R^{m×n} with m ≤ n and rank(A) = m. Its dual program is

    (D)   min ℓ(y) := b^T y,  y ∈ R^m          min ℓ(y) := b^T y,  y ∈ R^m, r ∈ R^n
          s.t. A^T y ≥ c                 ≡     s.t. A^T y − r = c,  r ≥ O

where we use r to denote the vector of dual slack variables. The ith column of A will be denoted by a_i, and the feasible regions of (D) and (P) will be denoted by F and G, respectively:

    F := {y ∈ R^m : A^T y ≥ c},     G := {x ∈ R^n : Ax = b, x ≥ O}.
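To fix ideas with a concrete instance of this notation, the following small NumPy sketch (with illustrative data of our own, not taken from the paper) builds a toy (P) and checks whether a given y belongs to F by testing the sign of the dual slacks:

    import numpy as np

    # Toy data (hypothetical): m = 2 equality constraints, n = 4 variables.
    A = np.array([[1.0, 2.0, 1.0, 0.0],
                  [2.0, 1.0, 0.0, 1.0]])      # full row rank, m <= n
    b = np.array([4.0, 5.0])
    c = np.array([3.0, 3.0, 0.0, 0.0])

    def dual_feasible(y, A, c, tol=1e-12):
        """y lies in F iff every dual slack r_i = a_i^T y - c_i is nonnegative."""
        r = A.T @ y - c
        return bool(np.all(r >= -tol)), r

    y = np.array([1.5, 1.5])
    ok, r = dual_feasible(y, A, c)
    print(ok, r)      # True, r = [1.5, 1.5, 1.5, 1.5]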
In deficient-basis simplex-type methods the basis matrix is a full column-rank matrix that is not restricted to be square. Using B^(k) and N^(k) := [1:n] \ B^(k) to denote the basic and non-basic variable sets respectively, the corresponding basis and non-basis matrices are defined (in Matlab notation) as

    B_k := A(:, B^(k)) ∈ R^{m×n_k},     N_k := A(:, N^(k)) ∈ R^{m×(n−n_k)},

where the cardinality n_k := |B^(k)| of the basic variable set is allowed to be less than the number m of rows: rank(B_k) = n_k ≤ m. Primal and dual two-phase deficient-basis approaches can be summarized as follows:

    Primal approach:                               Dual approach:
    Phase I: (x ∈ G) or (d of F)                   Phase I: (y ∈ F) or (δ of G)
         ↓ [Keep basic set?]                            ↓ [Keep basic set?]
    Phase II: Dual-feasibility search loop         Phase II: Primal-feasibility search loop
    Deficient-Basis (Primal) Simplex               Deficient-Basis (Dual) Simplex
      ≡ (Standard form) NSA                          ≡ (Inequality form) NSA
In primal approaches (shown on the left), Phase I tries to determine either a basic (possibly degenerate) primal feasible point x ∈ G or a direction d of the dual feasible region F, concluding in the latter case that the dual objective function ℓ is unbounded below. In the former case, the deficient-basis primal simplex method (also known as the standard-form non-simplex active-set (NSA) method) enters Phase II to perform a dual-feasibility search loop to achieve optimality while maintaining primal feasibility and complementary slackness. The interpretation of the dual approaches (shown on the right) is analogous; moreover, note that Phase I of the deficient-basis dual simplex method is not needed when a y ∈ F is known a priori, and that the dual simplex method could be directly applied if y ∈ F is a vertex of F. The algorithm we propose in this paper (called Algorithm 1) can be described by performing two modifications to the deficient-basis primal simplex method:
(i) Starting Phase II without primal feasibility achieved.
(ii) Taking advantage of an initial dual feasible point.

The former (which turns the algorithm into a double nested loop instead of a two-phase approach) was already put forward by Santos-Palomo [5], and hence this paper will focus on the latter. In any case, the entering/exchange/leaving criteria we present in the next section will necessarily differ from those in [5], so as to exploit the a priori knowledge of y ∈ F. Thus we put into a box the sentences we have added in both the external and the dual-feasibility search (i.e., internal) loop to highlight the algorithmic differences, besides the suitable change of notation to that used in this paper. The external loop of Algorithm 1 is as follows.

Algorithm 1: external loop

    Let y^(1) be dual feasible; k ← 1; B^(1) ← ∅; d^(1) ← −b (b ≠ O)
    while ∅ ≠ C ← {i ∈ N^(k) : a_i^T d^(k) < 0} do
        Determine a step length α; y^(k+1) ← y^(k) + α d^(k)
        Select Q ⊆ C to enter B^(k), with B^(k) ∪ Q linearly independent; B^(k+1) ← B^(k) ∪ Q
        if there is no null-space descent direction in dual space then
            k ← k + 1
            Determine a solution ŷ^(k) of B_k^T y = c_B
            Determine the solution x_B^(k) of B_k x_B = b
            Perform a dual-feasibility search loop
            if ℓ(ŷ^(k)) ≤ ℓ(y^(k)) then y^(k+1) ← ŷ^(k) endif
            if x_B^(k) ≥ O then stop (y^(k) optimal solution for (D)) endif
            Select P′ to leave B^(k); B^(k+1) ← B^(k) \ P′
        endif
        Determine a null-space descent direction d^(k+1); k ← k + 1
    endwhile
    stop (dual objective ℓ unbounded below)

There are two dual iteration points involved: y^(k) is a dual feasible point which is moved, when appropriate, in the direction d^(k) with a step length α determined by the usual dual min-ratio test, and ŷ^(k) is a dual exterior point which is able to attract y^(k) when it reaches F after the dual-feasibility search loop. In the last sentence of the external loop, we were able to infer that ℓ is unbounded below (rather than that (D) has no solution or ℓ is unbounded below) because we know that y^(1) ∈ F. The same reason allows us to avoid checking (just before a variable exchange) whether δ_B^(k) ≤ O to conclude the infeasibility of (D) in the dual-feasibility search loop of Algorithm 1, which reads as follows.
Algorithm 1: dual-feasibility search loop

    while ∅ ≠ V ← {i ∈ N^(k) : r_i(ŷ^(k)) := a_i^T ŷ^(k) − c_i < 0} do
        if ℓ(ŷ^(k)) < ℓ(y^(k)) then
            Determine a step length α ∈ [0, 1) for the direction ŷ^(k) − y^(k)
            y^(k+1) ← y^(k) + α(ŷ^(k) − y^(k))
        endif
        Select q := N_q′^(k) ∈ V to enter B^(k)
        if a_q ∈ R(B_k) then
            Determine the solution δ_B^(k) of the system B_k δ_B = a_q
            Select p′ := B_p^(k) to be exchanged for q; B^(k+1) ← B^(k) \ {p′} ∪ {q}
        else
            B^(k+1) ← B^(k) ∪ {q}
        endif
        Determine a solution ŷ^(k+1) of B_{k+1}^T y = c_B
        Determine the solution x_B^(k+1) of B_{k+1} x_B = b; k ← k + 1
    endwhile
In this dual-feasibility search loop we can also move y^(k) ∈ F in spite of having no null-space descent direction, because ŷ^(k) − y^(k) can be a descent direction. This is just one way to exploit y^(k); we will also exploit it by involving it in the pivoting rules presented in the next section. It is worth pointing out that in some descriptions of the simplex method, x_B^(k) is also denoted by b̂, δ_B^(k) by â_q, ŷ^(k) by π_B, and r_i(ŷ^(k)) by ĉ_i. Finally, note that full geometrical meaning can be given to the indices of the variables in the sets C and V by noting that column i of A (i.e., that corresponding to the ith primal variable) is associated with the dual constraint r_i := a_i^T y − c_i ≥ 0; this is the reason why we call C the set of dual constraints that are contrary to the current descent direction d^(k), and V the set of dual constraints that are violated at ŷ^(k). Moreover, the primal variable subvector x_B^(k) is associated with ŷ^(k); in fact, x_B^(k) can be considered as a vector of Lagrange multipliers for ŷ^(k) from a dual viewpoint of the problem.
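The step length α in the external loop is determined by the usual dual min-ratio test: move along d^(k) until the first non-basic dual constraint becomes active. As a minimal sketch (in NumPy, with naming of our own; the paper itself gives no code), the computation of C and of α could read:

    import numpy as np

    def dual_min_ratio_step(A, c, y, d, nonbasic, tol=1e-12):
        """Dual min-ratio test: largest alpha keeping A^T (y + alpha*d) >= c
        over the non-basic columns. Returns alpha, the contrary set C and
        the index of the blocking (newly active) dual constraint."""
        C, ratios = [], []
        for i in nonbasic:
            slope = A[:, i] @ d                # a_i^T d
            if slope < -tol:                   # dual constraint i is contrary to d
                r_i = A[:, i] @ y - c[i]       # current (nonnegative) dual slack
                C.append(i)
                ratios.append(r_i / -slope)
        if not C:                              # no contrary constraint: the ray stays in F
            return None, C, None
        j = int(np.argmin(ratios))
        return ratios[j], C, C[j]

On the data of Example 3.1 below (with y^(1) = O and d^(1) = [6/5; 1]) this reproduces, up to 0-based indexing, C(d^(1)) = {3, 4, 5, 8} and α = 25/28, with constraint 4 blocking.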
3 Pivoting criteria and small illustrative examples
In this section we give three small examples to illustrate how the specific pivoting rules (entering/exchange/leaving variable criteria) designed for Algorithm 1
deal with unbounded dual objectives, non-unique dual solutions and a classical exponential example by Goldfarb, thus avoiding some caveats of the dual simplex method.

As regards the entering variable criterion in the external loop, a first attempt could be as follows:

    if d^(k) is not dual feasible then (α = 0)
        Select an active dual constraint q_b at y^(k)
        Select a contrary dual constraint q_g from among the dual constraints in C that keep on being contrary to the updated null-space descent direction d̃^(k) (i.e., q_g ∈ C(d^(k)) ∩ C(d̃^(k)))
    else (α > 0)
        Select a contrary dual constraint q_g to the direction d^(k)
    endif
    B^(k+1) ← B^(k) ∪ {q_b, q_g}

However, getting the second "Select" out of the conditional can lead to a decrease in the number of iterations, and hence we consider the following alternative:

    if d^(k) is not dual feasible then (α = 0)
        Select an active dual constraint q_b at y^(k)
    else (α > 0)
        Select an active dual constraint q_b at the new dual feasible point after a step in the direction d^(k) with step length α > 0
    endif
    Select a contrary dual constraint q_g from among the dual constraints in C that keep on being contrary to the updated null-space descent direction d̃^(k) (i.e., q_g ∈ C(d^(k)) ∩ C(d̃^(k)))
    B^(k+1) ← B^(k) ∪ {q_b, q_g}

The "most contrary" rule of thumb can be utilized. Note that the choice of q_b obeys a boundary (or local) viewpoint, and that of q_g a global viewpoint of the dual space. Entering more than one variable (|P| > 1) allows us to join both viewpoints.

Example 3.1 (Non-unique dual solution) The application of Algorithm 1 to the linear program

    max −(5/2)x1 − (11/5)x2 − (13/4)x3 − (5/2)x4 − (7/5)x5 − (12/5)x8
    s.t. (5/4)x1 + (11/5)x2 − (3/2)x3 − (3/2)x4 − x5 + x6 − (6/5)x8 = −6/5
         −x1 − x2 − (6/5)x3 − x4 + x7 − x8 = −1
         x1, x2, x3, x4, x5, x6, x7, x8 ≥ 0
is depicted in Fig. 1.

[Figure 1. Example 3.1 dual linear program]

We have marked with r_i the line a_i^T y − c_i = 0 corresponding to the ith dual constraint, we have shaded the dual feasible region F, and the two parallel dotted lines represent the contour lines ℓ(y) = 0 and ℓ(y) = −4. As an aside, note that A → B → C → D is the path followed by the dual simplex method. Starting with the dual feasible point y^(1) at the origin, the basic variable set B^(1) = ∅ and the descent direction d^(1) = −b = [6/5; 1] (cf. Fig. 1), the indices of the contrary dual constraints to the direction d^(1) are C(d^(1)) = {3, 4, 5, 8}, so

    a_3^T d^(1)/‖a_3‖₂ = −3/(√369/10),    a_4^T d^(1)/‖a_4‖₂ = (−14/5)/(√13/2),
    a_5^T d^(1)/‖a_5‖₂ = −6/5,            a_8^T d^(1)/‖a_8‖₂ = (−61/25)/(√61/5),

and the dual min-ratio test gives α = 25/28; we then move the dual feasible point y^(1) in the direction d^(1) to obtain y^(2) = [15/14; 25/28] (shown with a circle in Fig. 1), where constraint q_b = 4 is active. Now, only the dual constraints in {3, 8} keep on being contrary to the updated direction d̃^(1) = [−10/3; 5], i.e., C(d^(1)) ∩ C(d̃^(1)) = {3, 8}. Hence

    a_3^T d̃^(1)/‖a_3‖₂ = −1/(√369/10),    a_8^T d̃^(1)/‖a_8‖₂ = −1/(√61/5),
and we select the most contrary dual constraint, in this case constraint q_g = 8. Summing up, P = {4, 8} enters the basic variable set and thus B^(2) = {4, 8}. Since |B^(2)| = m = 2, there is no corresponding null-space descent direction in the dual space.

By solving the systems B_2^T y = c_B and B_2 x_B = b for

    B_2 = [ −3/2  −6/5 ]
          [  −1    −1  ]

and c_B = [−5/2; −12/5], we obtain the dual iteration point ŷ^(2) = [1/3; 2] (cf. point D in Fig. 1, marked with a plus sign) and its associated primal variable subvector x_B^(2) = [0; 1], respectively. The dual slacks corresponding to ŷ^(2) are

    r(ŷ^(2)) = A^T ŷ^(2) − c = [11/12; 14/15; 7/20; 0; 16/15; 1/3; 2; 0],

so there are no violated dual constraints at ŷ^(2) and the algorithm does not enter the dual-feasibility search loop. Since ℓ(ŷ^(2)) = −12/5 < −61/28 = ℓ(y^(2)), we can move the dual feasible point to obtain y^(3) = ŷ^(2) = [1/3; 2] (cf. point D in Fig. 1, marked with a circle). Its associated primal variable subvector x_B^(2) is non-negative, so y^(3) is an optimal solution of (D) and the minimum value of the dual objective function is ℓ(y^(3)) = −12/5.

Note that we could have adopted the criterion described above as a first attempt. In this example it implies that q_b = 4 does not enter the basis and that only q_g = 8 enters the basis in the first iteration, so we have no null-space descent direction in the dual space. When solving the corresponding underdetermined system we obtain the dual exterior point ŷ^(2) = [72/61; 60/61] (marked with a plus sign in Fig. 1) as its minimum-norm solution; then the algorithm cannot move the dual feasible point y^(2) (α = 0) and enters the dual-feasibility search loop to choose variable 4 to enter the basis (the only violated dual constraint at this dual exterior point). Next, the dual point ŷ^(3) = [1/3; 2] (D in Fig. 1) is determined, which attracts y^(2) towards it (i.e., y^(4) = ŷ^(3) = [1/3; 2]), and the method stops with optimal solution y^(4) since dual feasibility, primal feasibility and complementary slackness have been achieved.

Example 3.2 (Unbounded dual objective) The linear program whose dual is

    min −2y1 − 3y2 + y3 + 12y4
    s.t.  y1 ≥ 0
          y2 ≥ 0
          y3 ≥ 0
          y4 ≥ 0
          2y1 + 9y2 − y3 − 9y4 ≥ 0
          −(1/3)y1 − y2 + (1/3)y3 + 2y4 ≥ 0

was constructed by Kuhn (cf. [15, p. 351]) to show that cycling can occur in
the dual simplex method. We initiate Algorithm 1 with the dual feasible point y^(1) at the origin, which is a degenerate dual vertex where all six dual constraints are active. Starting with the basis B^(1) = ∅ and the descent direction d^(1) = −b = [2; 3; −1; −12], the indices of the contrary dual constraints to the direction d^(1) are {3, 4, 6}, so

    a_3^T d^(1)/‖a_3‖₂ = −1,    a_4^T d^(1)/‖a_4‖₂ = −12,    a_6^T d^(1)/‖a_6‖₂ = −28/(√47/3),

and 6 is selected; since only dual constraint 4 keeps on being contrary to the updated direction d̃^(1) = [0; −1/3; 0; −1/6], P = {6, 4} enters the basic variable set and thus B^(2) = {6, 4}. Then a corresponding null-space descent direction in the dual space is d^(2) = [1; −1/3; 0; 0]. The indices of the contrary dual constraints to d^(2) are {2, 5}:

    a_2^T d^(2)/‖a_2‖₂ = −1/3,    a_5^T d^(2)/‖a_5‖₂ = −1/√167,    a_i^T d^(2) ≥ 0 for i ∈ {1, 3},
and 2 is selected; since dual constraint 5 is not contrary to the updated direction d̃^(2) = [1; 0; 1; 0] (i.e., C(d^(2)) ∩ C(d̃^(2)) = ∅), only P = {2} enters the basis and B^(3) = {6, 4, 2}. The direction d^(3) = d̃^(2) is such that A^T d^(3) ≥ O, and hence ℓ is unbounded below. Note that we can conclude this without moving the initial dual feasible point and without computing any exterior dual iteration point.

We have already described the entering variable criterion in the external loop. As regards the entering/exchange variable criterion in the internal (dual-feasibility search) loop, a first attempt would be to consider dual local information only in the external loop and hence:

    Select q as the most violated dual constraint at ŷ^(k).

However, considering dual local information in both loops can lead to a decrease in the number of iterations, which is accomplished by the following alternative:

    if ŷ^(k) − y^(k) is a descent direction then (α ∈ (0, 1))
        Select an active dual constraint q at the new dual feasible point after a step in the direction ŷ^(k) − y^(k) with step length α > 0
    else (α = 0)
        Select q as the most violated dual constraint at ŷ^(k)
    endif

As regards the exchange/leaving variable criterion in the internal loop, our
proposal is to use the primal min-ratio test (even if some x_{B_i}^(k) < 0):

    Let δ_B^(k) be the solution of the system B_k δ_B = a_q. Determine p′ := B_p^(k) such that

        τ = x_{B_p}^(k) / δ_{B_p}^(k) = min { x_{B_i}^(k) / δ_{B_i}^(k) : δ_{B_i}^(k) > 0, 1 ≤ i ≤ n_k }.
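A minimal NumPy sketch of this test (with naming of our own; xB and deltaB stand for the subvectors x_B^(k) and δ_B^(k) above) could be:

    import numpy as np

    def exchange_position(xB, deltaB, tol=1e-12):
        """Primal min-ratio test restricted to the positions with deltaB > 0;
        returns the position p attaining tau (the test is performed even if
        some component of xB is negative) together with tau itself."""
        mask = deltaB > tol
        if not np.any(mask):
            return None, None                  # no eligible position
        ratios = np.full(xB.shape, np.inf)
        ratios[mask] = xB[mask] / deltaB[mask]
        p = int(np.argmin(ratios))
        return p, ratios[p]

For instance, with the Example 3.3 data below (x_B^(3) = [−1/6; 17/6; 4/3] and δ_B^(3) = [1/6; 1/6; 2/3] for B^(3) = {3, 6, 2}), it returns the first position, whose basic variable is p′ = 3.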
Finally, the leaving variable criterion that allows the restarting procedure in the external loop is:

    Select p′ as the variable in B^(k) with the most negative component in x_B^(k).

None of these strategies suffices to ensure convergence to an optimal solution the first time that ŷ^(j) reaches F, as is illustrated in the following example.

Example 3.3 (Goldfarb's exponential example) The linear program whose dual is

    min β(2 − β²)y2 + (1 − β²)y3
    s.t.   y1 ≥ 0
          −βy1 + y2 ≥ 0
           y1 − βy2 + y3 ≥ 0
          −y1 ≥ −1
          −βy1 − y2 ≥ −γ
           y1 − βy2 − y3 ≥ −γ²

was constructed by Goldfarb (cf. [12]) to show that the dual simplex method with the steepest-edge pivoting rule can be forced to visit all intervening vertices (F is combinatorially equivalent to the m-cube). The application of both Algorithm 1 and the dual simplex method to a particular case of this linear program is depicted in Fig. 2, where we have labeled the vertices of F in lexicographical ordering to indicate the path followed by the dual simplex method. Note that we have shown only four out of the six faces of F:

    Constraint    Dual vertices    Shown
    1             A-D-E-H          No
    2             A-B-G-H          No
    3             A-B-C-D          Yes
    4             B-C-F-G          Yes
    5             C-D-E-F          Yes
    6             E-F-G-H          Yes
Figure 2. Goldfarb polytope (m = 3; β = 2; γ = 5)
We start with the initial dual feasible point y^(1) (A in Fig. 2) at the origin (at which dual constraints 1, 2 and 3 are active), the basis B^(1) = ∅ and the descent direction d^(1) = −b = [0; 4; 3] (cf. Fig. 2). The indices of the contrary dual constraints to the direction d^(1) are {3, 5, 6}; using the second entering variable criterion in the external loop, P = {3, 6} enters the basic variable set, thus B^(2) = {3, 6}, and a null-space descent direction in the dual space is d^(2) = [1/2; 1/4; 0]. The indices of the contrary dual constraints to d^(2) are {2, 4, 5} and only constraint 2 is blocking, hence P = {2} and B^(3) = {3, 6, 2}. Since |B^(3)| = m = 3, there is no corresponding null-space descent direction. The first exterior dual iteration point (marked with a plus sign in Fig. 2) and its associated primal variable subvector are ŷ^(3) = [25/6; 25/3; 25/2] and x_B^(3) = [−1/6; 17/6; 4/3]. The indices of the violated dual constraints at ŷ^(3) are {4, 5} and the algorithm enters the interior loop. Since ℓ(ŷ^(3)) = −425/6 < 0 = ℓ(y^(3)), the dual feasible point is moved along the direction ŷ^(3) − y^(3) with a step length α = 6/25 to obtain y^(4) = [1; 2; 3] (vertex B in Fig. 2). Applying the exchange/leaving variable criterion and dual local information in both loops, the index of the dual constraint to be exchanged for q = 4 (the constraint that is active after performing the movement) is p′ = 3, because
δ_B^(3) = [1/6; 1/6; 2/3]. The new basis is B^(4) = {6, 2, 4}, and the exterior dual iteration point and its primal variable subvector are ŷ^(4) = [1; 2; 22] and x_B^(4) = [3; 2; −1]. Hence the exterior dual iteration point reaches F for the first time at vertex G in Fig. 2. Since there are no violated dual constraints at ŷ^(4), the algorithm exits the interior loop and forces the dual feasible point y^(5) = [1; 2; 22] to meet ŷ^(4). But the primal variable subvector x_B^(4) associated with ŷ^(4) is not non-negative, and then we have to proceed to the restarting procedure, namely choosing p′ = 4 as the leaving variable (most negative component in x_B^(4)) and recovering a null-space descent direction d^(5) = [−1; −2; 3] in the dual space, which corresponds to B^(5) = {6, 2}. The only contrary dual constraint to d^(5) is constraint 1 and there are no blocking constraints, so we can move the dual feasible point y^(5) along d^(5) with α = 1 to reach the dual vertex H, y^(6) = [0; 0; 25]. Then 1 enters the basis (thus B^(6) = {6, 2, 1}) and there is no null-space descent direction. A new exterior dual iteration point ŷ^(6) = [0; 0; 25] and its associated primal variable subvector x_B^(6) = [3; 2; 1] are determined by solving B_6^T y = c_B and B_6 x_B = b. Now the algorithm can stop, with non-negative x_B^(6) for this new exterior dual iteration point ŷ^(6), since it coincides with y^(6).

Finally, note what happens when dual local information is considered only in the external loop and hence q is selected as the most violated dual constraint at ŷ^(3), instead of the dual constraint that is active after performing the movement. In this case q = 5 (the most violated dual constraint at ŷ^(3)) is chosen instead of q = 4; the variable to be exchanged for q is again p′ = 3, because δ_B^(3) = [2/3; 2/3; 5/3]. Now the new basis is B^(3bis) = {6, 2, 5}, and the exterior dual iteration point ŷ^(3bis) (close to G and marked with a plus sign in Fig. 2) and its associated primal variable subvector are ŷ^(3bis) = [5/4; 5/2; 85/4] and x_B^(3bis) = [3; 7/4; −1/4]. Constraint 4 is the only violated dual constraint at ŷ^(3bis) and the algorithm does not exit the internal loop. Although ℓ(ŷ^(3bis)) = −295/4 < −17 = ℓ(y^(3bis)) (with y^(3bis) = [1; 2; 3]), we cannot move the dual feasible point along the direction ŷ^(3bis) − y^(3bis) since α = 0, and then y^(4) = y^(3bis). Once again, the application of the exchange/leaving variable criterion with dual local information only in the external loop determines that p′ = 5 has to be exchanged for q = 4 (because δ_B^(3bis) = [0; 1/4; 1/4]), and the new basis is B^(4) = {6, 2, 4}. Summing up, we obtain the same basis with the dual iteration point in the same place, but one additional exterior dual iteration point and one additional iteration have been needed.

No computational results are reported in this paper, but in the next section we will sketch some details of a sparse projected-gradient implementation.
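As a sanity check of the figures quoted in Example 3.3 (not a computational experiment), the two linear systems behind ŷ^(3) and x_B^(3) are small enough to verify directly; the following NumPy snippet (our own arrangement of the data, with β = 2, γ = 5 and basis ordering B^(3) = {3, 6, 2}) reproduces them:

    import numpy as np

    # Columns a_i of A for Goldfarb's example with beta = 2, gamma = 5.
    a3 = np.array([ 1.0, -2.0,  1.0])
    a6 = np.array([ 1.0, -2.0, -1.0])
    a2 = np.array([-2.0,  1.0,  0.0])
    B3 = np.column_stack([a3, a6, a2])     # basis matrix for B^(3) = {3, 6, 2}
    cB = np.array([0.0, -25.0, 0.0])       # c_3 = 0, c_6 = -gamma^2, c_2 = 0
    b  = np.array([0.0, -4.0, -3.0])

    y_hat = np.linalg.solve(B3.T, cB)      # solves B_3^T y = c_B
    x_B   = np.linalg.solve(B3, b)         # solves B_3 x_B = b
    print(y_hat)    # [25/6, 25/3, 25/2] ≈ [4.1667, 8.3333, 12.5]
    print(x_B)      # [-1/6, 17/6, 4/3] ≈ [-0.1667, 2.8333, 1.3333]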
4 cholmod-based sparse implementation
Since the only theoretical results about deficient-basis simplex-type methods that we were able to obtain involve least-squares solutions of the subproblems occurring at intermediate steps, we choose to work with the QR factorization of B_k to obtain preliminary computational results in a dense framework, where B_k ∈ R^{m×n_k} (n_k ≤ m) is a full column-rank matrix whose columns are a subset of those of the fixed full row-rank matrix A ∈ R^{m×n} (m ≤ n). We also want to take advantage of the existing relationship between B_k and B_{k+1}, where B_{k+1} is the resulting full column-rank matrix after adding/deleting/exchanging columns of A to/from B_k, and this is easily accomplished with a QR factorization.

When the linear programs to be solved are dense, the dense QR factorization of B_k with an explicit orthonormal factor is advisable. But we are mainly interested in solving sparse linear programs with numerical difficulties, and hence the orthonormal factor cannot be explicitly maintained for sparsity reasons. This is not a problem, because the transpose of the triangular factor R_k of the QR factorization of B_k must coincide (barring signs) with the Cholesky factor L_k of B_k^T B_k, and Q_k := B_k L_k^{−T} would define implicitly a basis for R(B_k). Hence the triangular factor of the sparse QR factorization of B_k can be maintained by using Davis and Hager's cholmod package.

It is important to realize that we are working on top of a static structure, and hence we have to describe how to set it up. Let R^T be a Cholesky factor of the symmetric positive semidefinite matrix A^T A. It can be deduced from [16, 17] that

    L_k := chol(B_k^T B_k) ⊂ chol(A^T A) := R^T,

where the subset operation must be interpreted in a structural sense, namely the non-zero structure of L_k fits into that of R^T. Then the column ordering of A (chosen to sparsify R) dictates that of the columns of B_k. Thus a permutation P is computed such that P^T A^T A P has a Cholesky factor R^T as sparse as possible, and the columns of A are a priori permuted in such a way that the natural order coincides with the computed order. As we pointed out before, the static structure of R^T (which can be computed via a symbolic factorization of A^T A) has enough space to accommodate any L_k (i.e., R^T is an upper bound for the memory needed) when a suitable row and column numbering is used. To illustrate this fact, let us consider e.g. n = 6, m = 4, n_k = 3, B^(k) = [2, 4, 5], and

          [ ×                ]
          [    ⊗             ]
    R^T = [    ×  ×          ]
          [    ⊗  ×  ⊗       ]
          [             ⊗    ]
          [ ×  ×  ×  ×  ×  × ]
The key point is that L_k is stored in rows 2, 4 and 5 and columns 2, 4 and 5 of R^T (marked with a circle on top of the crosses defining the static structure), which can be expressed as L_k = R^T(B^(k), B^(k)) in Matlab notation.

Now we are ready to show the cholmod connection. In cholmod, a matrix A ∈ R^{n×m} is involved. (Mathematical bold font is used to avoid confusion with matrices previously defined.) Davis and Hager [13] maintain the sparse Cholesky factor of a symmetric matrix C := αI_n + AA^T ∈ R^{n×n}. The maintenance of the sparse Cholesky factor of C can be performed with several operations:

• It can be updated when a row of A is initially zero and becomes nonzero (row-addition operation, [13, §2]).
• It can be downdated when a row of A is nonzero and becomes zero (row-deletion operation, [13, §3]).
• It can be modified after a row exchange by performing a row-deletion followed by a row-addition (arbitrary row-modification, [13, §4.1]).

The crucial issue in understanding how cholmod can be exploited in a deficient-basis context is as follows. In cholmod, let A be a matrix with the same structure as A^T ∈ R^{n×m} but with all its elements initially set to zero, and let α be zero (C would no longer be positive definite, but this is not a problem [13, p. 624]) or else a tiny positive number (to ensure that C is positive definite). Now the rows of B_k^T are added sequentially to the structure as the deficient-basis algorithm proceeds, with row deletions and row exchanges allowed. cholmod is available as a set of C routines from Davis' web page at the University of Florida, and some of its features have already been incorporated into recent releases of Matlab. We expect that row updating/downdating capabilities will be incorporated into forthcoming Matlab releases.

Once L_k is available, the solution of the least-squares subproblems that occur in Algorithm 1 is computed by the corrected seminormal equations (CSNE) of first, second or third kind, where fixed precision iterative refinement (FPIR) must be done carefully. Let us briefly describe the details:

• A null-space descent direction d^(k) can be computed by first solving the linear least-squares problem

    min ‖B_k x_B − b‖₂

  with one step of FPIR after solving the seminormal equations of first kind

    B_k^T B_k x_B = L_k L_k^T x_B = B_k^T b,

  and then computing the opposite of the residual to obtain the steepest-descent
  null-space descent direction [15, pp. 377–378]:

    d^(k) = B_k x_B^(k) − b := P_{N(B_k^T)} · (−b).

  It is important to note that d^(k) must be updated (not recomputed) in a modified (as opposed to classical) FPIR scheme.

• The same technique can be used when the compatible system B_k x_B = b with b ∈ R(B_k) must be solved, and when testing whether v ∈ R(B_k) by checking whether the 2-norm of the residual of min ‖B_k δ_B − v‖₂ vanishes.

• The exterior dual iteration point ŷ^(k) can be obtained by projecting the dual iteration point y^(k) onto the linear manifold defined by the underdetermined system B_k^T y = c_B, namely by solving

    min ‖y − y^(k)‖₂   s.t.   B_k^T y = c_B.

  Its solution can be obtained by using the CSNE of third kind, which amounts to performing one step of modified FPIR after solving the seminormal equations of third kind

    B_k^T B_k z = L_k L_k^T z = c_B − B_k^T y^(k)   and   ŷ^(k) = B_k z^(k) + y^(k).
• Another alternative for the exterior dual iteration point ŷ^(k) is to compute the minimum 2-norm solution of the underdetermined system B_k^T y = c_B. This can be readily obtained by letting y^(k) be the zero vector in the above formulation, which is also known as the CSNE of second kind. It is important to note that ŷ^(k) must be updated (not recomputed) in a modified (as opposed to classical) FPIR scheme. The same technique can be used when |B^(k)| = m.

As pointed out by Björck in [18], computing the Cholesky factorization after forming B_k^T B_k (as opposed to directly computing the triangular factor of the QR factorization of B_k) implies that only one step of FPIR may not be enough to obtain accurate solutions when the underlying problem is severely ill-conditioned.
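To make the preceding recipes concrete, here is a minimal dense NumPy/SciPy sketch (a toy stand-in for the sparse cholmod-based computation; names are ours, and for brevity it performs the classical, not the modified, refinement variant) of the seminormal equations of first kind with one FPIR step, and of the projection defining ŷ^(k):

    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def csne_first_kind(B, b, L):
        """Solve L L^T x = B^T b, refine once, and return x together with the
        steepest-descent null-space direction d = B x - b (the projection of
        -b onto N(B^T))."""
        x = solve_triangular(L.T, solve_triangular(L, B.T @ b, lower=True))
        r = b - B @ x                          # least-squares residual
        dx = solve_triangular(L.T, solve_triangular(L, B.T @ r, lower=True))
        x = x + dx                             # one step of iterative refinement
        return x, B @ x - b

    def project_onto_manifold(B, cB, y, L):
        """CSNE of third kind: project y onto {y : B^T y = cB} via
        L L^T z = cB - B^T y and y_hat = B z + y."""
        z = solve_triangular(L.T, solve_triangular(L, cB - B.T @ y, lower=True))
        return B @ z + y

    # Toy usage with random data (B of full column rank, n_k <= m):
    rng = np.random.default_rng(0)
    B = rng.standard_normal((6, 3)); b = rng.standard_normal(6)
    L = cholesky(B.T @ B, lower=True)          # stand-in for the cholmod factor L_k
    x, d = csne_first_kind(B, b, L)
    print(np.linalg.norm(B.T @ d))             # ≈ 0: d lies in N(B_k^T)

Setting y to the zero vector in project_onto_manifold yields the minimum 2-norm solution of B^T y = cB (the CSNE of second kind mentioned above).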
5 Concluding remarks and future work
Two modifications to the deficient-basis primal simplex method have been shown to suffice to address what could be thought of as the deficient-basis dual counterpart of Paparrizos, Samaras and Stephanides' primal-dual simplex-type algorithm, namely taking advantage of the knowledge of a dual feasible point. Specific pivoting rules, with full geometrical meaning in the dual space, have been designed for the proposed algorithm to deal with several caveats of the dual simplex method, and these rules have been illustrated with carefully chosen examples, solved in detail, to highlight their potential usefulness when facing much larger, numerically difficult linear programs.

It is not straightforward at all to exploit new sparse linear algebra packages in a simplex-type environment, and the situation is even worse in a deficient-basis one. Nearly all these packages take into account recent hardware improvements but consider the updating/downdating features as work to be done. A step forward has been sketched in this paper, where the fact that Davis and Hager's cholmod sparse Cholesky factorization is row updatable/downdatable is exploited in a non-trivial manner to design a sparse projected-gradient implementation. As future work we plan to obtain computational results once this cholmod-based sparse implementation is fine-tuned. Then a fair duel with well-known linear programming solvers will take place.
References

[1] Ping-Qi Pan. A basis-deficiency-allowing variation of the simplex method for linear programming. Computers Math. Applic., 36(3):33–53, August 1998.
[2] Ping-Qi Pan and Yun-Peng Pan. A phase-1 approach for the generalized simplex algorithm. Computers Math. Applic., 42(10/11):1455–1464, November 2001.
[3] Ping-Qi Pan, Wei Li, Yong Wang, and Tao Chen. A Phase-I algorithm using the most-obtuse-angle rule for the basis-deficiency-allowing dual simplex method. Presented at The First International Conference on Optimization Methods and Software, December 15–18, 2002, Hangzhou, China, December 2002.
[4] Ping-Qi Pan, Wei Li, and Yong Wang. A Phase-I algorithm using the most-obtuse-angle rule for the basis-deficiency-allowing dual simplex method. OR Transactions (Chinese), 8(3):88–96, 2004.
[5] Ángel Santos-Palomo. The sagitta method for solving linear programs. Europ. J. Operl. Res., 157(3):527–539, September 2004.
[6] Ángel Santos-Palomo and Pablo Guerrero-García. A non-simplex active-set method for linear programs in standard form. Stud. Inform. Control, 14(2):79–84, June 2005.
[7] Ping-Qi Pan. A revised dual projective pivot algorithm for linear programming. SIAM J. Optim., 16(1):49–68, September 2005.
[8] Ping-Qi Pan, Wei Li, and Jun Cao. Partial pricing rule simplex method with deficient basis. Numer. Math. J. Chin. Univ. (Engl. Ser.), 15(1):23–30, February 2006.
[9] Wei Li, Pablo Guerrero-García, and Ángel Santos-Palomo. A basis-deficiency-allowing primal Phase-I algorithm using the most-obtuse-angle column rule. Computers Math. Applic., 51(6/7):903–914, March 2006.
[10] Katta G. Murty. Linear Programming. John Wiley and Sons, New York, NY, USA, September 1983.
[11] Konstantinos Paparrizos, Nikolaos Samaras, and George Stephanides. A new efficient primal dual simplex algorithm. Comput. Oper. Res., 30(9):1383–1399, August 2003.
[12] Donald Goldfarb. On the complexity of the simplex method. In S. Gómez and J.-P. Hennart, eds., Advances in Optimization and Numerical Analysis, volume 275 of Mathematics and Its Applications, pages 25–38. Kluwer Academic Publishers, Dordrecht, The Netherlands, April 1994.
[13] Timothy A. Davis and William W. Hager. Row modifications of a sparse Cholesky factorization. SIAM J. Matrix Anal. Appl., 26(3):621–639, March 2005.
[14] Wayne L. Winston. Operations Research: Applications and Algorithms. Duxbury Press, Belmont, CA, USA, fourth edition, January 2004.
[15] Philip E. Gill, Walter Murray, and Margaret H. Wright. Numerical Linear Algebra and Optimization, Vol. 1. Addison-Wesley, Redwood City, 1991.
[16] Åke Björck. A direct method for sparse least squares problems with lower and upper bounds. Numer. Math., 54(1):19–32, October 1988.
[17] Thomas F. Coleman and Laurie A. Hulbert. A direct active set algorithm for large sparse quadratic programs with simple lower bounds. Math. Progr., 45(3):373–406, December 1989.
[18] Åke Björck. Comment on the iterative refinement of least squares solutions. J. Amer. Stat. Assoc., 73(361):161–166, March 1978.