A polynomial primal-dual Dikin-type algorithm for linear programming
Report 93-36

B. Jansen C. Roos T. Terlaky

Faculty of Technical Mathematics and Informatics, Delft University of Technology

ISSN 0922-5641

Copyright c 1993 by the Faculty of Technical Mathematics and Informatics, Delft, The Netherlands. No part of this Journal may be reproduced in any form, by print, photoprint, microfilm, or any other means without permission from the Faculty of Technical Mathematics and Informatics, Delft University of Technology, The Netherlands. Copies of these reports may be obtained from the bureau of the Faculty of Technical Mathematics and Informatics, Julianalaan 132, 2628 BL Delft, phone +3115784568. A selection of these reports is available in PostScript form at the Faculty’s anonymous ftp-site. They are located in the directory /pub/publications/tech-reports at ftp.twi.tudelft.nl

DELFT UNIVERSITY OF TECHNOLOGY

REPORT Nr. 93-36
A Polynomial Primal-Dual Dikin-type Algorithm for Linear Programming

B. Jansen, C. Roos, T. Terlaky

ISSN 0922-5641
Reports of the Faculty of Technical Mathematics and Informatics Nr. 93-36
Delft 1993

B. Jansen, C. Roos and T. Terlaky, Faculty of Technical Mathematics and Informatics, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands. E-mail: [email protected], [email protected], [email protected]. This work was completed with the support of a research grant from SHELL. The first author is supported by the Dutch Organization for Scientific Research (NWO) under grant 611-304-028. The third author is on leave from the Eötvös University, Budapest, and partially supported by OTKA No. 2116.


Contents

1 Introduction
2 A new affine scaling direction
  2.1 Definition
  2.2 Relation with the old AFS direction
  2.3 More on the v-space
3 The Primal-Dual Affine Scaling Algorithm
  3.1 The step size
4 Reduction of the duality gap
5 The barrier function
  5.1 Definition and derivatives
  5.2 Bounds for the barrier function value
  5.3 Reduction of the barrier function
6 Convergence analysis

Abstract

In this paper we present a new primal-dual affine scaling method for linear programming. The method yields a strictly complementary optimal solution pair, and also allows a polynomial-time convergence proof. The search direction is obtained by using the original idea of Dikin, namely by minimizing the objective function (which is the duality gap in the primal-dual case) over some suitable ellipsoid. This gives rise to completely new primal-dual affine scaling directions, having no obvious relation with the search directions proposed in the literature so far. The new directions guarantee a significant decrease in the duality gap in each iteration, and at the same time they drive the iterates to the central path. In the analysis of our algorithm we use a barrier function which is the natural primal-dual generalization of Karmarkar's potential function. The iteration bound is $O(nL)$, which is a factor $O(L)$ better than the $O(nL^2)$ iteration bound of an earlier primal-dual affine scaling method (Monteiro, Adler and Resende, 1990).

Key words: interior-point method, affine scaling method, primal-dual method.


1 Introduction

In this paper we present a new primal-dual affine scaling method for linear programming (LP). The notion of affine scaling was introduced by Dikin [2], in 1967, as a tool for solving the (primal) problem in standard format
$$(P)\qquad \min\{c^Tx \;:\; Ax = b,\ x \ge 0\}.$$
The underlying idea is to replace the nonnegativity constraints $x \ge 0$ by the ellipsoidal constraint $\|X^{-1}(\tilde x - x)\| \le 1$, where $x$ denotes some given interior feasible point and $X$ the diagonal matrix corresponding to $x$. The problem of minimizing the objective function $c^T\tilde x$ over the intersection of this ellipsoid and the affine space determined by the affine constraints $A\tilde x = b$ can easily be solved. The solution is given by

$$\tilde x = x - XP_{AX}(Xc),$$
where $P_{AX}$ denotes the orthogonal projection onto the null space of $AX$. Dikin showed, under the assumption of nondegeneracy, that the process $x \to \tilde x$ converges to an optimal solution of $(P)$. This so-called affine scaling (AFS) method of Dikin remained unnoticed until 1985. The epoch-making work of Karmarkar [13], in 1984, initiated a tremendous amount of research in polynomial-time methods for LP, and gave rise to many new and efficient interior point methods (IPMs) for LP. The bibliography of Kranich [15] contains more than 1300 papers on the subject. For a survey we refer to Goldfarb and Todd [3], Gonzaga [7], den Hertog [10], den Hertog and Roos [11], Todd [27]. For reports on numerical efficiency of these methods we mention Lustig, Marsten and Shanno [16], McShane, Monma and Shanno [17], Mehrotra [19].

The importance of Dikin's approach is that in all polynomial-time IPMs for LP proposed after the work of Karmarkar, the search direction turns out to be a linear combination of the AFS direction $-XP_{AX}(Xc)$ and another direction, the so-called centering direction. See, e.g., Yamashita [30], Gonzaga [5], and den Hertog and Roos [11]. The centering direction is given by $XP_{AX}(e)$, where $e$ denotes the all-one vector in $\mathbb{R}^n$. It has the property that it drives the iterates to the so-called analytic center of the feasible region. This notion (of analytic center) was introduced by Sonnevend [26]. In the polynomial-time IPMs the affine scaling component of the search direction gives the progress to optimality, whereas the centering component in the search direction keeps the iterates away from the boundary of the feasible region. By taking appropriate linear combinations of the two directions one is able to approximately follow the so-called central path of the problem to an optimal solution pair. See, e.g., Gonzaga [6], Megiddo [18], Monteiro and Adler [21], Renegar [23], Roos and Vial [24]. Let us recall that every known method for solving $(P)$ also solves the dual problem of $(P)$, given by
$$(D)\qquad \max\{b^Ty \;:\; A^Ty + s = c,\ s \ge 0\}.$$
Apart from the efficiency aspect, the known polynomial-time IPMs for solving $(P)$ are also attractive because the solutions they generate are (close approximations of)


the analytic centers of the primal and the dual optimal set. This has been shown by Güler and Ye [9]. This implies that the obtained optimal solution pair $(x, (y, s))$ is strictly complementary. The existence of such optimal solution pairs has been known for a long time, due to a theorem of Goldman and Tucker [4], 1956, but this result was considered to be of theoretical interest only. Very recently it has become clear that the availability of a strictly complementary optimal solution pair is essential for reliable sensitivity analysis. See Adler and Monteiro [1] and Jansen et al. [12] for the theoretical aspects and Greenberg [8] for a discussion of the practical merits. So, from this point of view, it seems to be appropriate to say that a method for solving LP problems needs to generate a strictly complementary optimal solution pair.

There is a strong indication that in IPMs the centering component in the search direction is essential for obtaining a strictly complementary optimal solution pair in polynomial time. Without using the centering component, i.e. in the case of a pure (primal) AFS method, no polynomial-time convergence proof exists. Surprisingly enough, very recent results of Tsuchiya and Muramatsu [29], 1992, make clear that there is some centering effect in the (primal) AFS direction. Tsuchiya was the first who showed the convergence of some (primal) AFS method without any nondegeneracy assumption [28]. The strongest result [29] is that the AFS method with step size 2/3 of the maximal step size (up to the boundary of the feasible region) converges to an optimal solution of $(P)$. In fact, Tsuchiya's work has made clear that the dual solution generated by his AFS method is the analytic center of the dual optimal set. On the primal side, however, the method generates an interior point of the primal optimal set, but in case of degeneracy, not necessarily the analytic center of this set.

In this paper we present a primal-dual AFS method which not only yields a strictly complementary optimal solution pair, but also allows a polynomial-time convergence proof. It is not the first AFS method having these properties. In 1990, Monteiro, Adler and Resende [22] presented an AFS method of this type with an $O(nL^2)$ iteration bound. Let us briefly discuss the relation and the differences between the two approaches. First, both methods are primal-dual methods. This means that in each iteration a given primal-dual solution pair $(x, (y, s))$ is improved by moving simultaneously in the primal and the dual space. To describe the search direction used by Monteiro et al., let us recall that the primal-dual IPMs obtain their search directions by solving the linear system

$$A\Delta x = 0,\qquad A^T\Delta y + \Delta s = 0,\qquad X\Delta s + S\Delta x = \mu e - Xs,$$

where $e$ denotes the all-one vector in $\mathbb{R}^n$, and $\mu$ is a nonnegative real number. The solution of this system is given by
$$\Delta x = -DP_{AD}D(c - \mu X^{-1}e),\qquad \Delta y = (AD^2A^T)^{-1}(b - \mu AS^{-1}e),\qquad \Delta s = -A^T\Delta y,$$


where $D := (XS^{-1})^{1/2}$ and $P_{AD}$ denotes the matrix of projection onto the null space of the matrix $AD$. Note that if $\mu = 0$ then $\Delta x = -DP_{AD}(Dc)$ and $\Delta y = (AD^2A^T)^{-1}b$. In fact, these are the directions used by Monteiro et al. Note the similarity of the expression for $\Delta x$ with the expression for the primal AFS direction: the diagonal matrix $X$ in the primal AFS direction is replaced by $D$. A similar relation holds for the displacement in the dual space. By minimizing the dual objective over a suitable ellipsoid, the dual AFS direction turns out to be given by $(AS^{-2}A^T)^{-1}b$. Replacing $S^{-1}$ by $D$ in this expression one obtains the primal-dual AFS direction in the dual space. This explains why Monteiro et al. considered these directions to be the natural candidates for AFS directions in the primal-dual case.

One of the aims of this paper, however, is to show that in the primal-dual case the idea of AFS can be interpreted differently, and in fact more naturally. The main tool in deriving our direction will be the original idea of Dikin, namely to minimize the objective, which is the duality gap in the primal-dual case, over some suitable ellipsoid. As we will see in the next section this works out very well, and gives rise to completely different primal-dual AFS directions. To avoid confusion we will call these directions the new AFS directions. They will be introduced in the next section. There we will also further discuss the relations with the 'old' AFS directions. It is quite surprising that there is no obvious relation between the new search directions and the search directions proposed in the literature so far. As we will see, the new directions, just as the old AFS directions, are completely defined in terms of the current primal and dual iterates. They do not use an external parameter, like $\mu$ in the above system. Furthermore, like the old directions, they not only guarantee a significant decrease in the duality gap (see Section 4), but, as will become clear in Section 6, at the same time they drive the iterates to the so-called central path. The latter property ensures that the optimal solution pair generated by our algorithm, which is described in Section 3, will be strictly complementary.

In the analysis of our algorithm we will make use of a barrier function which is closely related to the potential function used by Karmarkar. In fact it is the natural primal-dual generalization of Karmarkar's potential function. One of its striking properties is that it is homogeneous in the iterates. Properties of this function are derived in Section 5, by using techniques which have become more or less standard. The analysis of our algorithm forces us to take quite small steps. However, our steps are essentially larger than the steps in the approach of Monteiro et al. This explains why our iteration bound, $O(nL)$, improves their bound, $O(nL^2)$, by a factor $L$. This improvement is due to the fact that the new AFS directions automatically center, contrary to the old AFS directions. Of course, our bound is still a factor $\sqrt n$ worse than the best polynomial-time bounds. It will be the subject of further research to improve the iteration bound.

As far as notations are concerned, $e$ denotes the vector of all ones of appropriate size, and $I$ the identity matrix. For any vector $x = (x_1, \ldots, x_n)$, $x^T$ denotes the transpose; the same notation is used for matrices. The capital $X$ denotes the diagonal matrix with the entries of $x$ on the diagonal. If $f$ is any function $\mathbb{R} \to \mathbb{R}$, then we denote by $f(x)$ the vector $(f(x_1), \ldots, f(x_n))$. Furthermore, if $s$ is another vector, $xs$ will denote the coordinatewise product of $x$ and $s$; so we have $xs = Xs$. Finally, $\|\cdot\|$ denotes the $l_2$-norm.

Acknowledgement:

We consider the approach presented in this paper as a nice synthesis of the work of Dikin and Karmarkar. We kindly acknowledge the visits of Dr. T. Tsuchiya (Tokyo) and Prof. C.C. Gonzaga (Rio de Janeiro) to Delft, earlier this year. Their lectures on the analysis of the primal AFS method gave rise to the conviction that similar things could be done in a primal-dual context. We also acknowledge some useful comments of J.-Ph. Vial (Geneva) and P. Ling (University of East Anglia, UK) on an earlier version of this paper.

2 A new affine scaling direction

2.1 Definition

Let $(x, s)$ be a pair of primal-dual interior feasible solutions, that is,
$$Ax = b,\ x > 0;\qquad A^Ty + s = c,\ s > 0.$$
As usual, it will be assumed throughout that such a pair exists. It is well known that this basic assumption can be made without loss of generality. We also assume that $A$ is an $m \times n$ matrix with rank $m$. As a consequence, any dual feasible $y$ is uniquely determined by its slack vector $s = c - A^Ty$. Let $\Delta x, \Delta y, \Delta s$ denote the search directions in the respective spaces. Neglecting for the moment the nonnegativity conditions, feasibility will be maintained by requiring that
$$A\Delta x = 0,\qquad A^T\Delta y + \Delta s = 0.$$

Following the idea of Dikin [2], we replace the nonnegativity conditions by requiring that the next iterates $(x + \Delta x, y + \Delta y, s + \Delta s)$ belong to a suitable ellipsoid. We define this ellipsoid by requiring that
$$\|X^{-1}\Delta x + S^{-1}\Delta s\| \le 1,$$
and call this the (primal-dual) Dikin ellipsoid. It may be noted that this ellipsoid is highly degenerate. For example, the linear space determined by the equation $X^{-1}\Delta x + S^{-1}\Delta s = 0$ is contained in the Dikin ellipsoid. Our aim is to minimize the duality gap $x^Ts$. The new duality gap is given by
$$(x + \Delta x)^T(s + \Delta s) = x^Ts + x^T\Delta s + s^T\Delta x.$$


Here we used that $\Delta x$ and $\Delta s$ are orthogonal, because $\Delta x$ is in the null space of $A$ and $\Delta s$ is in the row space of $A$. Now minimizing the new duality gap over the Dikin ellipsoid amounts to solving the following optimization problem:
$$\min\left\{ s^T\Delta x + x^T\Delta s \;:\; A\Delta x = 0,\ A^T\Delta y + \Delta s = 0,\ \|X^{-1}\Delta x + S^{-1}\Delta s\| \le 1 \right\}.$$

We proceed by showing that this problem uniquely determines the search direction vectors. For that purpose we apply the common rescaling of the variables. The vectors $d$ and $v$ are determined by
$$d := (xs^{-1})^{1/2},\qquad v := (xs)^{1/2}.$$
Using $d$ we can rescale both $x$ and $s$ to the same vector $v$:
$$d^{-1}x = ds = v.$$
We also use $d$ to rescale $\Delta x$ and $\Delta s$:
$$p_x := d^{-1}\Delta x,\qquad p_s := d\Delta s.$$

Note that the orthogonality of $\Delta x$ and $\Delta s$ implies that $p_x$ and $p_s$ are also orthogonal. Now we may write
$$x\Delta s + s\Delta x = xd^{-1}d\Delta s + sdd^{-1}\Delta x = v(p_x + p_s),$$
and, similarly,
$$x^{-1}\Delta x + s^{-1}\Delta s = x^{-1}dd^{-1}\Delta x + s^{-1}d^{-1}d\Delta s = v^{-1}(p_x + p_s).$$
As a consequence the objective function in the above optimization problem reduces to $v^T(p_x + p_s)$ and the ellipsoid constraint to $\|v^{-1}(p_x + p_s)\| \le 1$. The affine constraints in this problem can be reformulated as
$$ADp_x = 0,\qquad DA^T\Delta y + p_s = 0.$$

Clearly $p_x$ and $p_s$ can be uniquely characterized as the components of the vector
$$p_v := p_x + p_s$$
in the null space and the row space of the matrix $AD$, respectively. Therefore, we can reformulate the problem in terms of the vector $p_v$ alone:
$$\min\left\{ v^Tp_v \;:\; \|v^{-1}p_v\| \le 1 \right\}.$$
This problem can easily be solved. The solution is given by
$$p_v = -\frac{v^3}{\|v^2\|},$$


and the objective value by
$$v^Tp_v = -\|v^2\|.$$
The solution of the original problem now easily follows:
$$\Delta x = dP_{AD}(p_v),\qquad \Delta s = d^{-1}Q_{AD}(p_v),$$
where $P_{AD}$ and $Q_{AD}$ denote the orthogonal projections onto the null space of $AD$ and the row space of $AD$, respectively.
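In computational terms this solution is straightforward to form. The following sketch is ours (the report itself contains no code): it computes the new AFS directions with numpy, realizing the two projections through the normal equations of $AD$; all function names are illustrative.

    import numpy as np

    def new_afs_direction(A, x, s):
        """New primal-dual AFS direction of Section 2.1 (illustrative sketch).

        A has full row rank; x and s are positive interior vectors."""
        d = np.sqrt(x / s)                   # d = (x s^{-1})^{1/2}
        v = np.sqrt(x * s)                   # v = (x s)^{1/2}
        pv = -v**3 / np.linalg.norm(v**2)    # p_v = -v^3 / ||v^2||
        AD = A * d                           # the matrix AD, D = diag(d)
        # Q_AD(pv): orthogonal projection onto the row space of AD
        q = AD.T @ np.linalg.solve(AD @ AD.T, AD @ pv)
        p = pv - q                           # P_AD(pv): null-space component
        return d * p, q / d                  # dx = d p_x,  ds = d^{-1} p_s

One can check that $A\Delta x = 0$ and that $\Delta s$ lies in the row space of $A^T$, so a step in this direction preserves both feasibility conditions.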

2.2 Relation with the old AFS direction

It may be noted that the above directions differ from the 'old' AFS directions for the primal-dual case. This becomes clear by noting that $p_x, p_y, p_s$ are uniquely determined by the system
$$ADp_x = 0,\qquad DA^Tp_y + p_s = 0,\qquad p_x + p_s = -\frac{v^3}{\|v^2\|}.$$
This can be written as
$$A\Delta x = 0,\qquad A^T\Delta y + \Delta s = 0,\qquad X\Delta s + S\Delta x = -\frac{(xs)^2}{\|xs\|}.$$
In the old case the right-hand side member in the third equation of this system is $-xs$, as may be clear from the discussion in Section 1. (See also, e.g., [14, 21, 20].) Observe that in the above system this vector is multiplied, coordinatewise, by a unit vector in the direction of $xs$. So we may conclude that the new AFS direction coincides with the old AFS direction if and only if the vector $xs$ is a scalar multiple of the all-one vector $e$; as is well known this happens if and only if the pair $(x, s)$ is perfectly centered. The relation between the two directions becomes more apparent by noting that, up to some scalar factor, the old AFS directions are given by $dP_{AD}(q_v)$ for the $x$-space and $d^{-1}Q_{AD}(q_v)$ for the $s$-space, where
$$q_v := -\frac{v}{\|v\|}.$$

Note that $q_v$ is the solution of the problem
$$\min\left\{ v^Tq_v \;:\; \|q_v\| \le 1 \right\},$$
in which the domain is the unit sphere in the $v$-space. Note that as far as the $v$-space is concerned, the new direction is obtained by replacing this sphere by an ellipsoid which is contained in the nonnegative orthant of the $v$-space, and which is maximal with respect to this property. Observe that this is exactly the way in which Dikin defined the affine scaling direction for the primal problem, namely by minimizing $c^T\tilde x$ over the primal Dikin ellipsoid $\{\tilde x : \|X^{-1}(\tilde x - x)\| \le 1\}$. The analogy just described justifies the name affine scaling direction for the new direction more than for the old directions. In fact, from the above point of view we may say that the old AFS directions are obtained without scaling!

2.3 More on the v-space

We have seen that every positive primal-dual feasible pair $(x, s)$ determines a unique positive vector $v$ via the relation $v = (xs)^{1/2}$. In the rest of this paper we will frequently use the vector $v$ belonging to a pair $(x, s)$. Because this is not trivial, we want to point out in this section the surprising and important fact that every positive vector $v \in \mathbb{R}^n$ can be obtained in this way. In other words, the assignment $(x, s) \to (xs)^{1/2}$ defines a one-to-one mapping from the set of all positive primal-dual feasible pairs $(x, s)$ onto the positive orthant of $\mathbb{R}^n$. This can be understood as follows. Let $v$ be any positive vector in $\mathbb{R}^n$, and consider the weighted logarithmic barrier function
$$h(x) := c^Tx - \sum_{i=1}^n v_i^2 \ln x_i$$

for the primal problem. The gradient and the Hessian of $h(x)$ are given by
$$\nabla h(x) = c - v^2x^{-1},\qquad \nabla^2 h(x) = V^2X^{-2}.$$
Since the Hessian is positive definite, $h(x)$ is strictly convex, and hence (due to the basic assumption) has a unique minimizing point. The first order conditions for this point are
$$Ax = b,\ x \ge 0,\qquad c - v^2x^{-1} = A^Ty$$
for some $y \in \mathbb{R}^m$. Substituting $s = v^2x^{-1}$, this system can be rewritten as
$$Ax = b,\ x \ge 0;\qquad A^Ty + s = c,\ s \ge 0;\qquad xs = v^2.$$

We know that this system has a unique solution, $(x, y, s)$ say. Clearly the pair $(x, s)$ is mapped onto $v$. Recall that $x$ lies on the central path of $(P)$ if and only if for some dual feasible $y$ the slack vector $s := c - A^Ty$ is such that $xs$ is a scalar multiple of the all-one vector $e$. In that case $y$ lies on the central path of $(D)$. We call the pair $(x, s)$ centered


if this happens. Clearly, the pair $(x, s)$ is centered if and only if the associated vector $v = (xs)^{1/2}$ is a scalar multiple of the all-one vector $e$. Thus we conclude that the straight half line $\{\lambda e : \lambda > 0\}$ in the $v$-space represents the centered pairs. Also note that $\|v\|^2 = x^Ts$. So in the $v$-space the points with constant norm represent all positive feasible primal-dual pairs with a fixed duality gap.
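The existence argument above can be tested numerically. The sketch below is our construction, not a procedure from the paper: it applies a damped Newton method to the system $Ax = b$, $A^Ty + s = c$, $xs = v^2$, starting from an arbitrary interior feasible pair; names and test data are invented.

    import numpy as np

    def pair_from_v(A, b, c, x, y, s, v, iters=50):
        """Find the positive primal-dual pair mapped onto v (sketch)."""
        m, n = A.shape
        for _ in range(iters):
            # Newton step for F(x,y,s) = (Ax-b, A^T y + s - c, xs - v^2)
            J = np.block([
                [A,                np.zeros((m, m)), np.zeros((m, n))],
                [np.zeros((n, n)), A.T,              np.eye(n)],
                [np.diag(s),       np.zeros((n, m)), np.diag(x)]])
            F = np.concatenate([A @ x - b, A.T @ y + s - c, x * s - v**2])
            step = np.linalg.solve(J, -F)
            dx, dy, ds = step[:n], step[n:n+m], step[n+m:]
            t = 1.0
            while np.any(x + t*dx <= 0) or np.any(s + t*ds <= 0):
                t /= 2                      # damp to stay in the interior
            x, y, s = x + t*dx, y + t*dy, s + t*ds
        return x, y, s

    # Example (data ours): after convergence, sqrt(x*s) reproduces v.
    A, b, c = np.array([[1., 1.]]), np.array([2.]), np.array([1., 2.])
    x, y, s = pair_from_v(A, b, c, np.array([1., 1.]), np.array([0.5]),
                          np.array([0.5, 1.5]), np.array([1.0, 0.7]))
    print(np.sqrt(x * s))                   # approximately [1.0, 0.7]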

3 The Primal-Dual Affine Scaling Algorithm

Taking for $\Delta x$ and $\Delta s$ the displacements according to a full Dikin step, we can now describe the primal-dual affine scaling method.

Primal-Dual Affine Scaling Algorithm

Parameters:
  $\varepsilon$ is the accuracy parameter;
  $\alpha$ is the 'step size' (default value $1/(16\sqrt n)$).

Input:
  $(x^0, s^0)$: the initial pair of interior feasible solutions.

begin
  $x := x^0$; $s := s^0$;
  while $x^Ts > \varepsilon$ do
    $x := x + \alpha\Delta x$; $s := s + \alpha\Delta s$;
  end
end.
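A compact numpy rendering of the algorithm may be helpful; it is our sketch (the direction computation follows Section 2.1; the test problem and all names are ours).

    import numpy as np

    def new_afs_direction(A, x, s):
        d, v = np.sqrt(x / s), np.sqrt(x * s)
        pv = -v**3 / np.linalg.norm(v**2)
        AD = A * d
        q = AD.T @ np.linalg.solve(AD @ AD.T, AD @ pv)   # row-space part
        return d * (pv - q), q / d                       # (dx, ds)

    def dikin_primal_dual(A, x, s, eps=1e-6):
        """Primal-dual Dikin steps until the duality gap drops below eps."""
        alpha = 1 / (16 * np.sqrt(len(x)))               # default step size
        while x @ s > eps:
            dx, ds = new_afs_direction(A, x, s)
            x, s = x + alpha * dx, s + alpha * ds
        return x, s

    # A small instance (ours). Any positive pair (x0, s0) is interior
    # feasible for b := A x0 and c := A^T y + s0, so b and c stay implicit.
    A = np.array([[1., 1., 0., 0.],
                  [0., 0., 1., 1.]])
    x0 = np.ones(4)
    s0 = np.array([0.8, 1.2, 0.9, 1.1])                  # f(v0) < 1/12
    x, s = dikin_primal_dual(A, x0, s0)
    print(x, s, x @ s)

Note that the method never needs $b$ or $c$ explicitly: as in the old AFS case, the direction is completely determined by $A$ and the current pair $(x, s)$.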

3.1 The step size

We proceed by investigating how large a step in the described direction can be made without becoming infeasible. So, we want to know for which $\alpha$ the new iterates $x + \alpha\Delta x$ and $s + \alpha\Delta s$ are positive. For this purpose we need the following notations:
$$v_{\max} := \max_i v_i,\qquad v_{\min} := \min_i v_i.$$
Note that this is equivalent to $v_{\max} = \|v\|_\infty$ and $v_{\min} = \|v^{-1}\|_\infty^{-1}$. The next lemma gives a simple condition on the step size $\alpha$ which guarantees that the new iterates are feasible.


Lemma 3.1 If $\alpha < \frac{v_{\min}}{v_{\max}}$ then $x + \alpha\Delta x > 0$ and $s + \alpha\Delta s > 0$.

Proof:
Using that $x = dv$ and $\Delta x = dp_x$, the next iterate in the $x$-space is given by
$$x + \alpha\Delta x = d(v + \alpha p_x) = dv(e + \alpha v^{-1}p_x).$$
Since $dv = x > 0$, the positivity of the new iterate in the $x$-space is maintained if $\alpha\|v^{-1}p_x\| < 1$. This will certainly be true if
$$\alpha\|p_x\| < v_{\min}.$$
Similarly, the positivity of the new iterate in the $s$-space is maintained if
$$\alpha\|p_s\| < v_{\min}.$$
Now using that
$$\|p_v\|^2 = \|p_x\|^2 + \|p_s\|^2,$$
it becomes clear that the next iterates will be positive if
$$\alpha\|p_v\| < v_{\min}.$$
Since
$$\|p_v\| = \frac{\|v^3\|}{\|v^2\|} \le v_{\max},$$
the latter inequality will certainly hold if
$$\alpha v_{\max} < v_{\min}.$$
This implies the result of the lemma. $\Box$

The above lemma makes clear that a full Dikin step (with $\alpha = 1$) is feasible whenever the vector $v$ is a scalar multiple of the all-one vector, i.e., if the pair $(x, s)$ is perfectly centered.
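Lemma 3.1 is easy to check numerically. The sketch below is ours: it compares the bound $v_{\min}/v_{\max}$ with the largest feasible step computed by the usual ratio test, for random positive data (recall that any positive pair $(x, s)$ is interior feasible for $b := Ax$ and $c := A^Ty + s$).

    import numpy as np

    def new_afs_direction(A, x, s):
        d, v = np.sqrt(x / s), np.sqrt(x * s)
        pv = -v**3 / np.linalg.norm(v**2)
        AD = A * d
        q = AD.T @ np.linalg.solve(AD @ AD.T, AD @ pv)
        return d * (pv - q), q / d

    def max_feasible_step(x, s, dx, ds):
        """sup of a with x + a*dx > 0 and s + a*ds > 0 (ratio test)."""
        r = np.concatenate([dx / x, ds / s])
        r = r[r < 0]
        return np.inf if r.size == 0 else float((-1.0 / r).min())

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 6))
    x, s = rng.uniform(0.5, 2.0, 6), rng.uniform(0.5, 2.0, 6)
    dx, ds = new_afs_direction(A, x, s)
    v = np.sqrt(x * s)
    assert v.min() / v.max() <= max_feasible_step(x, s, dx, ds) + 1e-12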

4 Reduction of the duality gap

In this section we consider the effect of a Dikin step with step size $\alpha$ on the duality gap. We will use the superscript $+$ to refer to entities after a step. So we have
$$x^+ = x + \alpha\Delta x = d(v + \alpha p_x),\qquad s^+ = s + \alpha\Delta s = d^{-1}(v + \alpha p_s),\qquad (v^+)^2 = x^+s^+ = (v + \alpha p_x)(v + \alpha p_s).$$


Lemma 4.1 One has
$$(x^+)^Ts^+ \le \left(1 - \frac{\alpha}{\sqrt n}\right)x^Ts.$$

Proof:
Recall that
$$x^Ts = e^Tv^2 = \|v\|^2.$$
The duality gap $(x^+)^Ts^+$ after a Dikin step with step size $\alpha$ is given by
$$\|v^+\|^2 = e^Tv^2 + \alpha e^Tv(p_x + p_s) + \alpha^2 p_x^Tp_s.$$
Since $p_x$ and $p_s$ are orthogonal, and $p_x + p_s = p_v$, we obtain
$$(x^+)^Ts^+ = e^Tv^2 + \alpha v^Tp_v = \|v\|^2 - \alpha\|v^2\| = \left(1 - \alpha\frac{\|v^2\|}{\|v\|^2}\right)x^Ts.$$
Using the inequality of Cauchy-Schwarz one gets
$$\sqrt n\,\|v^2\| = \|e\|\,\|v^2\| \ge e^Tv^2 = \|v\|^2. \qquad (1)$$
Hence the lemma follows. $\Box$
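The identity established in this proof, $(x^+)^Ts^+ = \left(1 - \alpha\frac{\|v^2\|}{\|v\|^2}\right)x^Ts$, is exact, and can be confirmed in floating point; the check below is ours.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 6))
    x, s = rng.uniform(0.5, 2.0, 6), rng.uniform(0.5, 2.0, 6)
    d, v = np.sqrt(x / s), np.sqrt(x * s)
    pv = -v**3 / np.linalg.norm(v**2)
    AD = A * d
    q = AD.T @ np.linalg.solve(AD @ AD.T, AD @ pv)
    dx, ds = d * (pv - q), q / d            # the new AFS directions
    alpha = 0.1
    gap_new = (x + alpha * dx) @ (s + alpha * ds)
    gap_pred = (1 - alpha * np.linalg.norm(v**2) / (v @ v)) * (x @ s)
    assert abs(gap_new - gap_pred) < 1e-8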

5 The barrier function

5.1 Definition and derivatives

In the analysis of the algorithm we will make use of the function $f : \mathbb{R}^n_+ \to \mathbb{R}$ determined by
$$f(v) := n\ln\|v\|^2 - \sum_{i=1}^n \ln v_i^2 - n\ln n.$$

Note that this function is homogeneous in $v$, and moreover, due to the arithmetic-geometric mean inequality, that $f(v) \ge 0$, with equality if and only if $v$ is a scalar multiple of the all-one vector $e$. Since $f$ is constant on all rays emanating from the origin, it cannot be strictly convex. But this is the only deviation from being strictly convex, as we will show now. To this end we calculate the gradient and the Hessian of $f(v)$:
$$\nabla f(v) = \frac{2n}{\|v\|^2}v - 2v^{-1},\qquad \nabla^2 f(v) = \frac{2n}{\|v\|^2}I + 2V^{-2} - \frac{4n}{\|v\|^4}vv^T.$$
One easily verifies that $v^T\nabla^2 f(v)v = 0$, which is in accordance with the behaviour on the rays just mentioned. Now let $w$ be any vector in $\mathbb{R}^n$ which is orthogonal to $v$. Then we have $w^T\nabla^2 f(v)w \ge 0$, with equality if and only if $w = 0$, as can easily be verified. This implies that the matrix $\nabla^2 f(v)$ is positive semidefinite. In the proof of the next lemma, and also in the rest of the paper, we will make use of the vector $w$ defined by
$$w := \frac{nv^2}{\|v\|^2} = \frac{nxs}{x^Ts}.$$
Note that $e^Tw = n$, and hence we have
$$\|w\|^2 \ge n. \qquad (2)$$

In fact this inequality is a restatement of (1).

Lemma 5.1 If $v$ is not a scalar multiple of $e$, then $p_v$ is a descent direction for $f$.

Proof:
One has
$$(\nabla f(v))^Tp_v = -\left(\frac{2n}{\|v\|^2}v - 2v^{-1}\right)^T\frac{v^3}{\|v^2\|} = \frac{2}{\|v^2\|}\left(e^Tv^2 - \frac{n\,e^Tv^4}{\|v\|^2}\right) = \frac{2\|v\|^2}{\|v^2\|}\left(1 - \frac{n\|v^2\|^2}{\|v\|^4}\right) = \frac{2}{\|w\|}\left(n - \|w\|^2\right).$$
Because of (2) we conclude that $(\nabla f(v))^Tp_v \le 0$, with equality if and only if $w$, and hence $v$, is a scalar multiple of $e$. $\Box$
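Both the homogeneity of $f$ and Lemma 5.1 can be probed directly; the small check below is ours.

    import numpy as np

    def f(v):
        n = len(v)
        return n * np.log(v @ v) - np.sum(np.log(v**2)) - n * np.log(n)

    def grad_f(v):
        return 2 * len(v) * v / (v @ v) - 2 / v

    v = np.array([0.5, 1.0, 2.0])            # not a multiple of e
    pv = -v**3 / np.linalg.norm(v**2)
    assert grad_f(v) @ pv < 0                # Lemma 5.1: descent direction
    assert abs(f(2.5 * v) - f(v)) < 1e-12    # f is homogeneous of degree 0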

5.2 Bounds for the barrier function value

In this section we derive an upper and a lower bound for the value of $f(v)$ in terms of the norm of the vector $w$. Recall that $w$ belongs to the simplex $\{\xi \in \mathbb{R}^n : e^T\xi = n,\ \xi \ge 0\}$. The deviation of $w$ from the all-one vector $e$ will be measured by the quantity
$$\sigma := \sqrt{\frac{n-1}{n}\left(\|w\|^2 - n\right)}.$$

Observe that this number is nonnegative, and zero if and only if $w$, and hence also $v$, is a scalar multiple of the all-one vector. So the number $\sigma$ can be considered as a measure for the proximity of the pair $(x, s)$ to the central path. In fact, one may easily verify that $\sigma$ is equal to $\sqrt{n-1}\,\tan\theta$, where $\theta$ is the angle between $w$ and $e$. The barrier function can nicely be expressed in terms of $w$ as follows:
$$f(v) = -\ln\left(\prod_{i=1}^n w_i\right).$$

At this stage the following lemma, which is a reformulation of a technical lemma from Schrijver ([25], page 192), becomes of interest. In this lemma $B(e, r)$ denotes the ball in $\mathbb{R}^n$ with radius $r$ and centered at $e$.

Lemma 5.2 Suppose that $\sigma < 1$. If $\xi$ runs through the intersection of the simplex with $B(e, \|w - e\|)$, then the product $\xi_1\xi_2\cdots\xi_n$ is minimal (maximal) if and only if one of the coordinates $\xi_i$ attains its minimal (maximal) value and the remaining coordinates of $\xi$ are mutually equal. Furthermore, the maximal value of the coordinates of $\xi$ is $1 + \sigma$, and the minimal value is $1 - \sigma$.

One easily verifies that if one of the coordinates of $\xi$ (as defined in the lemma) assumes its maximal value $1 + \sigma$, then the remaining coordinates are all equal to $1 - \sigma/(n-1)$. Similarly, if one of the coordinates of $\xi$ assumes its minimal value $1 - \sigma$, then the remaining coordinates are all equal to $1 + \sigma/(n-1)$. Thus it follows that
$$(1 - \sigma)\left(1 + \frac{\sigma}{n-1}\right)^{n-1} \le \prod_{i=1}^n w_i \le (1 + \sigma)\left(1 - \frac{\sigma}{n-1}\right)^{n-1}.$$
Taking logarithms, we obtain

Lemma 5.3 One has
$$\ln(1 - \sigma) + (n-1)\ln\left(1 + \frac{\sigma}{n-1}\right) \le -f(v) \le \ln(1 + \sigma) + (n-1)\ln\left(1 - \frac{\sigma}{n-1}\right).$$

Theorem 5.1 If $\sigma < 1$ then one has
$$\frac{\sigma^2}{2(1+\sigma)} \le f(v) \le \frac{\sigma^2}{1-\sigma^2}.$$

Proof:
The proof of the first inequality goes as follows. From the previous lemma we deduce that
$$f(v) \ge -\ln(1 + \sigma) - (n-1)\ln\left(1 - \frac{\sigma}{n-1}\right).$$
Now using that for $0 < x < 1$:
$$-\ln(1 - x) \ge x,\qquad -\ln(1 + x) \ge -x + \frac{x^2}{2(1+x)},$$
we derive
$$f(v) \ge -\sigma + \frac{\sigma^2}{2(1+\sigma)} + (n-1)\frac{\sigma}{n-1} = \frac{\sigma^2}{2(1+\sigma)}.$$
The second inequality is obtained in a different way. Using the previous lemma we write
$$f(v) \le -\ln(1 - \sigma) - (n-1)\ln\left(1 + \frac{\sigma}{n-1}\right) = -\ln(1 - \sigma) - \ln\left(1 + \frac{\sigma}{n-1}\right)^{n-1} \le -\ln(1 - \sigma) - \ln(1 + \sigma) = -\ln(1 - \sigma^2).$$
Now using that for $0 < x < 1$:
$$-\ln(1 - x) \le x + \frac{x^2}{2(1-x)} \le \frac{x}{1-x},$$
the theorem follows. $\Box$

We derive another interesting result from the above technical lemma. Define

$$\omega := \frac{v_{\max}}{v_{\min}}.$$
The relevance of this number is clear from Lemma 3.1, which implies that the step size $\alpha$ is feasible if $\alpha \le \omega^{-1}$. Clearly, $\omega^2 = \frac{w_{\max}}{w_{\min}}$. The next lemma is an obvious consequence of Lemma 5.2.

Lemma 5.4 One has $\omega^2 \le \frac{1+\sigma}{1-\sigma}$.
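The quantities $\sigma$ and $\omega$ and the bounds of Theorem 5.1 and Lemma 5.4 are directly computable; the following check, on random data of our choosing, confirms them.

    import numpy as np

    rng = np.random.default_rng(2)
    v = rng.uniform(0.8, 1.2, 8)             # a mildly off-center v
    n = len(v)
    w = n * v**2 / (v @ v)
    sigma = np.sqrt((n - 1) / n * (w @ w - n))
    fval = n * np.log(v @ v) - np.sum(np.log(v**2)) - n * np.log(n)
    omega = v.max() / v.min()
    assert sigma < 1
    assert sigma**2 / (2 * (1 + sigma)) <= fval <= sigma**2 / (1 - sigma**2)
    assert omega**2 <= (1 + sigma) / (1 - sigma) + 1e-12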

5.3 Reduction of the barrier function

In this section we consider the effect of a Dikin step with step size $\alpha$ on the barrier function. We will make use of the following simple lemma.

Lemma 5.5 Let $h$ be a vector in $\mathbb{R}^n$ such that $\|h\| < 1$. Then
$$\sum_{i=1}^n \ln(1 + h_i) \ge e^Th + \|h\| + \ln(1 - \|h\|).$$

Proof:
Using that $|h_i| < 1$ for $1 \le i \le n$, we may write
$$-\sum_{i=1}^n \ln(1 + h_i) = -\sum_{i=1}^n\left(h_i - \frac{h_i^2}{2} + \frac{h_i^3}{3} - \cdots\right) = -e^Th + \sum_{i=1}^n\left(\frac{h_i^2}{2} - \frac{h_i^3}{3} + \frac{h_i^4}{4} - \cdots\right) \le -e^Th + \frac{\|h\|^2}{2} + \frac{\|h\|^3}{3} + \frac{\|h\|^4}{4} + \cdots = -e^Th - \|h\| - \ln(1 - \|h\|).$$
This implies the lemma. $\Box$

As before, we will use the superscript $+$ to refer to entities after a step. Defining
$$\Delta f(\alpha) := f(v^+) - f(v),$$
we will prove

Lemma 5.6 Let $\sigma < 1$ and let $\omega$ be as defined before. If the step size $\alpha$ is such that $\alpha\omega < 1$, then
$$\Delta f(\alpha) \le -\frac{\alpha\sigma^2}{\sqrt n} - \frac{\alpha^2}{2}\left(1 + \frac{\sigma^2}{n-1}\right) + \frac{\alpha^2\omega^2}{2(1 - \alpha\omega)}.$$

Proof:
Recall that our choice of $\alpha$ guarantees the feasibility of the new iterates. We may write
$$\Delta f(\alpha) = n\ln\frac{\|v^+\|^2}{\|v\|^2} - e^T\ln\frac{(v^+)^2}{v^2}.$$
Recall that
$$(v^+)^2 = (v + \alpha p_x)(v + \alpha p_s),\qquad \|v^+\|^2 = \left(1 - \alpha\frac{\|v^2\|}{\|v\|^2}\right)\|v\|^2.$$
Hence we obtain
$$\Delta f(\alpha) = n\ln\left(1 - \alpha\frac{\|v^2\|}{\|v\|^2}\right) - e^T\ln(e + \alpha v^{-1}p_x) - e^T\ln(e + \alpha v^{-1}p_s).$$
Next we apply Lemma 5.5 to the concatenation of the vectors $\alpha v^{-1}p_x$ and $\alpha v^{-1}p_s$. Note that this lemma applies, because
$$\alpha^2\|v^{-1}p_x\|^2 + \alpha^2\|v^{-1}p_s\|^2 \le \frac{\alpha^2}{v_{\min}^2}\left(\|p_x\|^2 + \|p_s\|^2\right) = \frac{\alpha^2\|p_v\|^2}{v_{\min}^2} \le \alpha^2\omega^2 < 1.$$
Lemma 5.5 gives
$$\Delta f(\alpha) \le n\ln\left(1 - \alpha\frac{\|v^2\|}{\|v\|^2}\right) - \alpha e^T(v^{-1}p_v) - \alpha\omega - \ln(1 - \alpha\omega).$$
Using the definition of $w$, this reduces to
$$\Delta f(\alpha) \le n\ln\left(1 - \frac{\alpha\|w\|}{n}\right) + \frac{\alpha n}{\|w\|} - \alpha\omega - \ln(1 - \alpha\omega).$$
Since $\ln(1 - x) \le -x - \frac{x^2}{2}$ for $0 < x < 1$, this can be further reduced to
$$\Delta f(\alpha) \le -\alpha\|w\| - \frac{\alpha^2\|w\|^2}{2n} + \frac{\alpha n}{\|w\|} - \alpha\omega - \ln(1 - \alpha\omega).$$
The definition of $\sigma$ implies that
$$\|w\|^2 = n\left(1 + \frac{\sigma^2}{n-1}\right).$$
Substitution yields
$$\Delta f(\alpha) \le -\alpha\left(\|w\| - \frac{n}{\|w\|}\right) - \frac{\alpha^2}{2}\left(1 + \frac{\sigma^2}{n-1}\right) - \alpha\omega - \ln(1 - \alpha\omega).$$
Using that $\sigma < 1$, the coefficient of $\alpha$ in the first term can be bounded from below as follows:
$$\|w\| - \frac{n}{\|w\|} = \frac{\|w\|^2 - n}{\|w\|} = \frac{\frac{n\sigma^2}{n-1}}{\sqrt{n\left(1 + \frac{\sigma^2}{n-1}\right)}} = \frac{\sigma^2\sqrt n}{\sqrt{(n-1)(n-1+\sigma^2)}} \ge \frac{\sigma^2}{\sqrt n}.$$
Finally, using the inequality $-\ln(1 - x) \le x + \frac{x^2}{2(1-x)}$ for $0 < x < 1$, we may write
$$-\alpha\omega - \ln(1 - \alpha\omega) \le -\alpha\omega + \alpha\omega + \frac{\alpha^2\omega^2}{2(1 - \alpha\omega)} = \frac{\alpha^2\omega^2}{2(1 - \alpha\omega)}.$$
Hence the lemma follows. $\Box$


6 Convergence analysis

Theorem 6.1 Suppose that the initial value $v^0 = (x^0s^0)^{1/2}$ of the vector $v$ is such that $f(v^0) \le \frac{1}{12}$, and that the step size $\alpha$ at each iteration equals $\frac{1}{16\sqrt n}$. Then after
$$O\left(n\ln\frac{(x^0)^Ts^0}{\varepsilon}\right)$$
iterations the algorithm has generated a feasible primal-dual pair $(x, s)$ such that $x^Ts \le \varepsilon$. Moreover, this pair is almost centered.

Proof:
For the proof it suffices to prove the claim that the property $f(v) \le \frac{1}{12}$ is maintained in the course of the algorithm. This claim will imply the theorem, as we now first show. By Lemma 4.1, in each iteration the duality gap is multiplied by a factor not greater than
$$1 - \frac{\alpha}{\sqrt n}.$$
Hence the iteration bound in the theorem is a direct consequence of the specified step size. Also note that if $f(v) \le \frac{1}{12}$, then Theorem 5.1 implies that the proximity measure $\sigma$ will satisfy
$$\frac{\sigma^2}{2(1+\sigma)} \le \frac{1}{12}.$$
From this one easily derives that $\sigma \le 0.5$. Hence, it follows from the above claim that all iterates will be almost centered.

So it remains to prove the claim. In the proof below we will assume that $n \ge 4$. This technical assumption simplifies the proof, but is not essential. Suppose that $v$ denotes an iterate, and that $f(v) \le \frac{1}{12}$. We shall show that the next iterate, represented by $v^+$, also satisfies the claim. We have
$$f(v^+) \le f(v) + \Delta f(\alpha),$$
and Lemma 5.6 gives an upper bound for $\Delta f(\alpha)$. Instead of this upper bound we will use the weaker bound
$$\Delta f(\alpha) \le -\frac{\alpha\sigma^2}{\sqrt n} - \frac{\alpha^2}{2} + \frac{\alpha^2\omega^2}{2(1 - \alpha\omega)}.$$
From $\sigma \le 0.5$ we derive, by using Lemma 5.4,
$$\omega^2 \le \frac{1 + \frac{1}{2}}{1 - \frac{1}{2}} = 3.$$
Hence $\omega \le \sqrt 3$. Consequently, since $n \ge 4$,
$$\alpha\omega \le \frac{\sqrt 3}{16\sqrt n} \le \frac{\sqrt 3}{32} \le \frac{1}{18}.$$
We first consider the case that $\sigma \ge 0.1$. In that case one has $\Delta f(\alpha) \le 0$ (whence $f(v^+) \le f(v) \le \frac{1}{12}$), as we will now show. Substitution of $\alpha = \frac{1}{16\sqrt n}$, $\omega^2 \le \frac{1+\sigma}{1-\sigma}$ and $\frac{1}{1-\alpha\omega} \le \frac{18}{17}$ yields
$$\Delta f(\alpha) \le \frac{1}{32n}\left(-2\sigma^2 - \frac{1}{16} + \frac{1}{16}\cdot\frac{18}{17}\cdot\frac{1+\sigma}{1-\sigma}\right).$$
One may easily verify that the expression between the brackets is monotonically decreasing if $0.1 \le \sigma \le 0.5$, and that it is negative if $\sigma = 0.1$. This proves the claim that $\Delta f(\alpha) \le 0$ if $\sigma \ge 0.1$.

We proceed by treating the case that $\sigma < 0.1$. First we derive from Theorem 5.1 that
$$f(v) \le \frac{\sigma^2}{1 - \sigma^2} < \frac{1}{99}.$$
Now consider $\Delta f(\alpha)$. The first two terms in the above bound for $\Delta f(\alpha)$ being nonpositive, we certainly have
$$\Delta f(\alpha) \le \frac{\alpha^2\omega^2}{2(1 - \alpha\omega)}.$$
Although we have a much sharper bound for $\omega$ in the present case, it suffices to use the above bound $\alpha\omega \le \frac{1}{18}$, which holds for all $\sigma \le 0.5$. Substitution of this value gives
$$\Delta f(\alpha) \le \frac{1}{2 \cdot 17 \cdot 18}.$$
As a consequence we will have
$$f(v^+) \le \frac{1}{99} + \frac{1}{2 \cdot 17 \cdot 18} = 0.011735\ldots < \frac{1}{12}.$$
This completes the proof. $\Box$
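The invariant $f(v) \le \frac{1}{12}$ that drives this proof can be observed experimentally. The sketch below is ours (it repeats the helpers from earlier sections; the problem data are invented): it runs the algorithm with $\alpha = 1/(16\sqrt n)$ and asserts the invariant at every iteration.

    import numpy as np

    def new_afs_direction(A, x, s):
        d, v = np.sqrt(x / s), np.sqrt(x * s)
        pv = -v**3 / np.linalg.norm(v**2)
        AD = A * d
        q = AD.T @ np.linalg.solve(AD @ AD.T, AD @ pv)
        return d * (pv - q), q / d

    def f(v):
        n = len(v)
        return n * np.log(v @ v) - np.sum(np.log(v**2)) - n * np.log(n)

    A = np.array([[1., 1., 0., 0.],
                  [0., 0., 1., 1.]])
    x = np.ones(4)
    s = np.array([0.8, 1.2, 0.9, 1.1])       # f(v0) ~ 0.051 <= 1/12
    alpha = 1 / (16 * np.sqrt(4))
    for _ in range(1000):
        dx, ds = new_afs_direction(A, x, s)
        x, s = x + alpha * dx, s + alpha * ds
        assert f(np.sqrt(x * s)) <= 1 / 12   # the invariant of the proof
    print(x @ s)                             # the gap shrinks geometrically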

References

[1] Adler, I. and Monteiro, R.D.C. (1992), A Geometric View of Parametric Linear Programming, Algorithmica 8, 161-176.

[2] Dikin, I.I. (1967), Iterative Solution of Problems of Linear and Quadratic Programming, Doklady Akademiia Nauk SSSR 174, 747-748.

[3] Goldfarb, D. and Todd, M.J. (1989), Linear Programming, in: Handbooks in Operations Research and Management Science, 1: Optimization, 141-170, G.L. Nemhauser, A.H.G. Rinnooy Kan and M.J. Todd, eds., Amsterdam.

[4] Goldman, A.J. and Tucker, A.W. (1956), Theory of Linear Programming, in: Linear Inequalities and Related Systems (H.W. Kuhn and A.W. Tucker, eds.), Annals of Mathematical Studies, No. 38, Princeton University Press, Princeton, New Jersey, 53-97.

[5] Gonzaga, C.C. (1991), Search Directions for Interior Linear Programming Methods, Algorithmica 6, 153-181.

[6] Gonzaga, C.C. (1989), An Algorithm for Solving Linear Programming Problems in $O(n^3L)$ Operations, in: Progress in Mathematical Programming, Interior Point and Related Methods, 1-28, N. Megiddo, ed., Springer Verlag, New York.

[7] Gonzaga, C.C. (1992), Path Following Methods for Linear Programming, SIAM Review 34, 167-224.

[8] Greenberg, H.J. (1993), The Use of the Optimal Partition in a Linear Programming Solution for Postoptimal Analysis, University of Colorado at Denver, Denver, USA, preprint.

[9] Güler, O. and Ye, Y. (1991), Convergence Behavior of Some Interior-Point Algorithms, Department of Management Sciences, The University of Iowa, Iowa City, Iowa.

[10] Hertog, D. den (1993), Interior Point Approach to Linear, Quadratic and Convex Programming - Algorithms and Complexity, Kluwer Publishing Comp., Dordrecht, The Netherlands (forthcoming).

[11] Hertog, D. den and Roos, C. (1991), A Survey of Search Directions in Interior Point Methods for Linear Programming, Mathematical Programming 52, 481-511.

[12] Jansen, B., Roos, C. and Terlaky, T. (1992), An Interior Point Approach to Postoptimal and Parametric Analysis in Linear Programming, Technical Report 92-21, Faculty of Technical Mathematics and Informatics, Delft University of Technology, Delft, The Netherlands.

[13] Karmarkar, N.K. (1984), A New Polynomial-Time Algorithm for Linear Programming, Combinatorica 4, 373-395.

[14] Kojima, M., Mizuno, S. and Yoshise, A. (1989), A Primal-Dual Interior Point Algorithm for Linear Programming, in: Progress in Mathematical Programming, Interior Point and Related Methods, 29-47, N. Megiddo, ed., Springer Verlag, New York.

[15] Kranich, E. (1991), Interior Point Methods for Mathematical Programming: a Bibliography, Diskussionsbeitrag Nr. 171, FernUniversität Hagen, Hagen, Germany. (An updated version is available on Netlib.)

[16] Lustig, I.J., Marsten, R.E. and Shanno, D.F. (1991), Computational Experience With a Primal-Dual Interior Point Method for Linear Programming, Linear Algebra and Its Applications 152, 191-222.

[17] McShane, K.A., Monma, C.L. and Shanno, D.F. (1989), An Implementation of a Primal-Dual Interior Point Method for Linear Programming, ORSA Journal on Computing 1, 70-83.

[18] Megiddo, N. (1989), Pathways to the Optimal Set in Linear Programming, in: Progress in Mathematical Programming, Interior Point and Related Methods, 131-158, N. Megiddo, ed., Springer Verlag, New York.

[19] Mehrotra, S. (1990), On the Implementation of a (Primal-Dual) Interior Point Method, Technical Report 90-03, Department of IE/MS, Northwestern University, Evanston, IL. To appear in SIAM Journal on Optimization.

[20] Mizuno, S. and Nagasawa, A. (1992), A Primal-Dual Affine Scaling Potential Reduction Algorithm for Linear Programming, Research Memorandum No. 427, Department of Prediction and Control, The Institute of Statistical Mathematics, Tokyo, Japan.

[21] Monteiro, R.D.C. and Adler, I. (1989), Interior Path Following Primal-Dual Algorithms, Part I: Linear Programming, Mathematical Programming 44, 27-41.

[22] Monteiro, R.D.C., Adler, I. and Resende, M.G.C. (1990), A Polynomial-Time Primal-Dual Affine Scaling Algorithm for Linear and Convex Quadratic Programming and its Power Series Extension, Mathematics of Operations Research 15, 2, 191-214.

[23] Renegar, J. (1988), A Polynomial-Time Algorithm Based on Newton's Method for Linear Programming, Mathematical Programming 40, 59-94.

[24] Roos, C. and Vial, J.-Ph. (1992), A Polynomial Method of Approximate Centers for Linear Programming, Mathematical Programming 54, 295-305.

[25] Schrijver, A. (1986), Theory of Linear and Integer Programming, John Wiley and Sons, New York.

[26] Sonnevend, G. (1986), An "Analytical Centre" for Polyhedrons and New Classes of Global Algorithms for Linear (Smooth, Convex) Programming, Lecture Notes in Control and Information Sciences 84, 866-878.

[27] Todd, M.J. (1989), Recent Developments and New Directions in Linear Programming, in: Mathematical Programming: Recent Developments and Applications, M. Iri and K. Tanabe, eds., Kluwer Academic Press, Dordrecht, Holland, 109-157.

[28] Tsuchiya, T. (1991), Global Convergence of the Affine Scaling Methods for Degenerate Linear Programming Problems, Mathematical Programming 52, 377-404.

[29] Tsuchiya, T. and Muramatsu, M. (1992), Global Convergence of a Long-Step Affine Scaling Algorithm for Degenerate Linear Programming Problems, Research Memorandum No. 423, The Institute of Statistical Mathematics, Tokyo, Japan.

[30] Yamashita, H. (1986), A Polynomially and Quadratically Convergent Method for Linear Programming, Mathematical Systems Institute, Inc., Tokyo, Japan.