MATHEMATICS OF OPERATIONS RESEARCH
Vol. 17, No. 3, August 1992
Printed in U.S.A.

GLOBAL CONVERGENCE PROPERTY OF THE AFFINE SCALING METHODS FOR PRIMAL DEGENERATE LINEAR PROGRAMMING PROBLEMS*†

TAKASHI TSUCHIYA

In this paper we investigate the global convergence property of the affine scaling method under the assumption of dual nondegeneracy. The behavior of the method near degenerate vertices is analyzed in detail on the basis of the equivalence between the affine scaling methods for homogeneous LP problems and Karmarkar's method. It is shown that the step-size 1/8, where the displacement vector is normalized with respect to the distance in the scaled space, is sufficient to guarantee global convergence for dual nondegenerate LP problems. The result can be regarded as a counterpart to Dikin's global convergence result on the affine scaling method assuming primal nondegeneracy.

0. Introduction. Since Karmarkar [9] proposed the projective scaling method for linear programming in 1984, a number of interior point methods have been proposed and implemented. The primal affine scaling method for standard form linear programming problems [4, 6, 19] and the dual affine scaling method [1] for dual standard form linear programming problems are simplified variants of Karmarkar's method, in which the affine scaling transformation is used in place of the projective transformation. Since the methods are simple and conceptually easy to understand, many researchers have dealt with their implementation, and several promising results [1, 2, 10, 12] have been reported. Though the primal and the dual affine scaling methods apply to different forms of linear programming problems and their iterative formulae appear different, we can derive one from the other, so that they are mathematically equivalent. In contrast to the simplicity of the methods and the promising experimental results, there are several basic questions to be answered from the viewpoint of theory. One of the most interesting open problems is whether or not they are polynomial-time algorithms. It was shown by Megiddo and Shub [11] that the trajectory of the continuous version of the primal affine scaling method may visit the neighborhoods of all the vertices of the Klee-Minty cube, and hence many researchers believe that the methods do not have the polynomial property. The global convergence property is a weaker, but more fundamental, property that should be guaranteed in any algorithm for continuous optimization. However, the existence of degenerate problems makes the analysis difficult, and hence no proof has yet been given for the global convergence of these affine scaling methods in the case where both primal and dual degeneracy take place.
All the known global convergence results deal with the primal affine scaling method and assume either the primal nondegeneracy condition [7] ([20] is a detailed and elucidative English exposition of Dikin's proof [7]) or both the primal and dual nondegeneracy conditions [4, 19]. In [3], global convergence was shown for the continuous version of the method without assuming any nondegeneracy condition. In this paper we give a proof of the global convergence property of the dual affine scaling method which requires only the dual nondegeneracy condition. The behavior of the method near degenerate vertices is analyzed in detail on the basis of the equivalence between the affine scaling methods for homogeneous linear programming problems and Karmarkar's method [5, 8, 15, 16, 21]. Applying the techniques developed for the local analysis of several interior point methods in the presence of degeneracy [17, 18] and for the global analysis of dynamical systems related to Karmarkar's method [14], we show that the step-size 1/8, where the displacement vector is normalized with respect to the distance in the scaled space, is sufficient to guarantee global convergence for dual nondegenerate linear programming problems. Though our result is stated in the dual standard form set-up, it can be directly applied to show the global convergence of the primal affine scaling method under the assumption of dual nondegeneracy as well. In this sense our result can be regarded as a counterpart to Dikin's result [7], which proved the global convergence of the primal affine scaling method under the assumption of primal nondegeneracy.

*Received September 26, 1989; revised September 27, 1990.
AMS 1980 subject classification. Primary: 90C05.
IAOR 1973 subject classification. Main: Programming: Linear.
OR/MS Index 1978 subject classification. Primary: 642 Programming/Linear.
Key words. Linear programming, interior point methods, Karmarkar's method, affine scaling method, global convergence property, degenerate problems.
†This paper is an improved English version of the paper presented under the same title in Japanese at the group meeting of the Research Group of Mathematical Programming of the Operations Research Society of Japan (June 24, 1989, held at The Institute of Statistical Mathematics, Tokyo, Japan).

0364-765X/92/1703/0527/$01.25. Copyright © 1992, The Institute of Management Sciences/Operations Research Society of America.

1. Problem. In this paper we deal with the dual standard form linear programming problem (D):

(1.1)  minimize c^t x,  subject to x ∈ P = {x ∈ R^n | A^t x − b ≥ 0},

where A = (a_1, ..., a_m) ∈ R^{n×m}, c ∈ R^n and b ∈ R^m.
We put the following assumptions throughout the paper. (1) The feasible region P has an interior point and Rank(A) = n. This is a conventional assumption made in most of the literature on interior point methods. In addition, we require the following two assumptions in Lemma 6.1 and Theorem 2.1, the main theorem of this paper: (2) The set of the optimal solution(s) of (D) is nonempty; (3) The objective function is not constant on any face of P except for the vertices. In particular, assumption (3) plays a substantial role in this paper. This assumption is sometimes referred to as "the assumption of dual nondegeneracy". We prove the global convergence of the dual affine scaling method [1] under assumptions (1)-(3). To this end, we proceed as follows. In §2 we introduce the dual affine scaling method and explain our main result in detail. In §3 we provide several notations and propositions regarding the elementary theory of polyhedra, and introduce a new coordinate system by choosing independent slack variables. Section 4 is devoted to deriving an asymptotic formula for the projection matrix which appears in the iterative formula of the dual affine scaling method represented in the space of slack variables. In §5 we introduce the concept of local Karmarkar potential functions associated with dual degenerate faces, and prove a key lemma of this paper that relates the dual affine scaling method to the reduction of the local Karmarkar potential functions. In §6, making use of the assumption of dual nondegeneracy, we observe that the sequence converges to a vertex. Amalgamating the results obtained in §§5 and 6, we prove in §7 that the limiting vertex is the optimal solution, which establishes global convergence of the affine scaling method under the assumption of dual nondegeneracy. Section 8 contains concluding remarks. Assumptions (2) and (3) are used in §§6 and 7 only, and the lemmas and propositions in §§3, 4 and 5 hold regardless of these assumptions. Since we want to present our results as generally as possible, we carry out the analysis, in principle, without requiring assumptions (2) and (3), and we explicitly indicate assumptions (2) and (3) in the statements of the lemmas and theorems which require these conditions.

Before proceeding to the next section, we introduce basic notations. For a vector v, we denote by [v] the diagonal matrix whose diagonal entries are the elements of v. We denote the slack variables A^t x − b by ξ(x), and define the "metric" matrix G(x) for the affine scaling method as follows:

(1.2)  G(x) = A [ξ(x)]^{-2} A^t.

1 and I denote the vector of all ones and the identity matrix of proper dimension, respectively. We use ‖·‖ (without subscript) for the 2-norm. For the sequence {x^{(ν)}} (ν = 1, ...; x^{(ν)} ∈ R^n), we abbreviate {f(x^{(ν)})}, {g(x^{(ν)})}, etc. as {f^{(ν)}}, {g^{(ν)}}, etc. We denote by x⁺ the new point obtained by performing one iterative step at the point x ∈ R^n, and use f⁺, g⁺, etc. to denote f(x⁺), g(x⁺), etc. We do not indicate arguments of functions when they are obvious from the context.

2. The dual affine scaling method and the main result. Let x^{(ν)} be an interior point of the polyhedron P. In the dual affine scaling method, we determine the "new approximate solution" x^{(ν+1)} according to the following formula:

(2.1)  x^{(ν+1)} = x^{(ν)} − λ^{(ν)} G(x^{(ν)})^{-1} c / {c^t G(x^{(ν)})^{-1} c}^{1/2}.

As we will show in Lemma 2.2, x^{(ν+1)} is also an interior feasible solution if we choose 0 < λ^{(ν)} < 1, so that the iteration can be continued recursively. Since G(x) is a positive definite matrix, we easily see that the method is a descent method for c^t x. If we assume assumptions (2) and (3), the set {x ∈ P | c^t x ≤ c^t x^{(1)}} is compact, and hence the sequence generated by (2.1) always has an accumulation point. The main result of this paper is as follows:

THEOREM 2.1. Let (D) be a linear programming problem satisfying assumptions (1)-(3). If we choose the step-size λ^{(ν)} = 1/8 and apply the dual affine scaling method (2.1) to (D), the generated sequence converges to the optimal solution, which is a vertex of P.
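As a concrete illustration of iteration (2.1) — and of the sphere property that Lemma 2.2 below rests on — here is a minimal numerical sketch. The problem data (a two-variable box) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal sketch of one dual affine scaling step (2.1) for
# minimize c^t x  subject to  A^t x - b >= 0.
A = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])   # n = 2, m = 4: the box 0 <= x <= 3
b = np.array([0.0, 0.0, -3.0, -3.0])
c = np.array([1.0, 1.0])

def step(x, lam):
    xi = A.T @ x - b                     # slack vector xi(x)
    G = A @ np.diag(xi**-2) @ A.T        # metric matrix (1.2)
    d = np.linalg.solve(G, c)            # G(x)^{-1} c
    return x - lam * d / np.sqrt(c @ d)  # iteration (2.1)

x = np.array([1.5, 2.0])                 # an interior point
lam = 1.0 / 8.0                          # the step-size of Theorem 2.1
x_new = step(x, lam)

xi, xi_new = A.T @ x - b, A.T @ x_new - b
# The scaled slack [xi]^{-1} xi^+ lies on the sphere of radius lam
# centered at 1 (cf. Lemma 2.2), so it stays positive whenever lam < 1.
print(np.linalg.norm(xi_new / xi - 1.0))  # -> 0.125 (= lam)
print(c @ x_new < c @ x)                  # descent in c^t x -> True
```

The sphere radius comes out exactly equal to λ because ‖[ξ]^{-1}A^tG^{-1}c‖² = c^tG^{-1}c, the identity proved as (2.5) below.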

We will prove this theorem in §7. Some readers may be confused in comparing this result with the existing results on the global convergence property given by Dikin [7], Barnes [4], and Vanderbei et al. [19], since we deal with the dual affine scaling method for the dual standard form linear programming problems while the others work with the primal affine scaling method for the standard form linear programming problems. Barnes [4] and Vanderbei et al. [19] independently proved the global convergence of the primal affine scaling method in the case where both the primal nondegeneracy condition and the dual nondegeneracy condition are satisfied. The best convergence result obtained so far is the one by Dikin [7], who proved the global convergence of the primal affine scaling method under the assumption of primal nondegeneracy. Though our result is for the dual standard form linear programming problems, it can be directly applied to show the global convergence of the primal affine scaling method under the assumption of dual nondegeneracy. (Strictly speaking, our condition on dual nondegeneracy is slightly weaker than the one in [4] and [19].) Hence, our result may be regarded as a counterpart to Dikin's convergence result. A brief discussion of this point is given in the Appendix. Before concluding this section, we prove Lemma 2.2.

LEMMA 2.2. If x^{(ν)} is an interior point of P and 0 < λ^{(ν)} < 1 in the iteration (2.1), then x^{(ν+1)} is also an interior point of P.

PROOF. Let us denote x^{(ν)}, x^{(ν+1)}, λ^{(ν)} by x, x⁺, λ. Then (2.1) is written as

(2.2)  x⁺ = x − λ G(x)^{-1} c / {c^t G(x)^{-1} c}^{1/2}.

In terms of the slack variables, we have

(2.3)  ξ⁺ = ξ − λ A^t G(x)^{-1} c / {c^t G(x)^{-1} c}^{1/2},

where ξ = ξ(x) and ξ⁺ = ξ(x⁺). Obviously, x⁺ is an interior point of P if and only if [ξ]^{-1} ξ⁺ > 0. We represent the iteration in the space of the scaled slack variables as follows:

(2.4)  [ξ]^{-1} ξ⁺ = 1 − λ [ξ]^{-1} A^t G(x)^{-1} c / {c^t G(x)^{-1} c}^{1/2}.

From the definition of G, we see that

(2.5)  ‖[ξ]^{-1} A^t G(x)^{-1} c‖ / {c^t G(x)^{-1} c}^{1/2} = 1.

This means that [ξ]^{-1} ξ⁺ is a point on the surface of the ball of radius λ centered at 1. Hence λ < 1 implies [ξ]^{-1} ξ⁺ > 0, so that x⁺ is also an interior point of P. □

3. Preliminaries. In this section we introduce further notations regarding polyhedra, together with preliminary propositions and concepts which will be used in the remainder of this paper.
(1) We use F, G, ... to denote the faces of P. We do not treat the empty set as a face. We denote by F* the set of the optimal solution(s) of (D), if it exists. If assumptions (2) and (3) are satisfied, F* is a vertex of P. For a face F of P, we denote by E(F) the set of indices of the constraints which are always satisfied with equality on the face. We sometimes abbreviate E(F) as E when the face which associates with the notation E is obvious from the context.
(2) Given a set X ⊆ {1, ..., m} of indices, we denote by A_X, b_X the matrix and the vector composed of the corresponding coefficient vectors and constants. We use ξ_X(x) for A_X^t x − b_X. Analogously, for a vector v, we denote by v_X the vector which is composed of the part of v associated with X. We introduce the following set which is naturally determined from X:

(3.1)  T(X) = {x ∈ P | A_X^t x − b_X = 0}.

Note that T(∅) is P itself, and that T(X) may be empty if X is chosen arbitrarily. If T(X) ≠ ∅, then T(X) is a face of P. Conversely, given a face F of P, there always exists an index set X' with which F can be represented as F = T(X'). (Often X' is not unique.)
(3) A point x on a face F of P is referred to as an "interior point of F" if ξ_{E(F)}(x) = 0 and ξ_i(x) > 0 (i ∉ E(F)). The interior point of a vertex is the vertex itself. The face F is characterized as the smallest face (as a set) among the faces which contain the point x as their element.
(4) For an index set X, we use |X| to denote its cardinality. If X is a (proper) subset of another index set Y, we write X ⊆ (⊂) Y. We denote by X − Y the set which consists of the indices belonging to X but not to Y. The complement of X, which is defined as {1, ..., m} − X, is written X^c. Let Z be an index set of constraints. We can choose an index set B ⊆ Z such that the columns of A_B are a basis for the range space of A_Z. Since Rank(A) = n, due to the elementary theory of linear algebra, we can choose another index set B̄ from the complement of Z such that A_{B∪B̄} forms a nonsingular matrix. Then ξ_{B∪B̄} is regarded as another coordinate than x, where the coordinate transformation is given by

(3.2)  ξ_{B∪B̄}(x) = A_{B∪B̄}^t x − b_{B∪B̄},  x(ξ_{B∪B̄}) = (A_{B∪B̄}^t)^{-1} (ξ_{B∪B̄} + b_{B∪B̄}).

We refer to the pair (B, B̄) as a "pair of basis index sets associated with the index set Z". In this paper we use the letters B and B̄ as the notation for such pairs of basis index sets. When we want to make clear that the pair is associated with the index set Z, we write them as B(Z) and B̄(Z). We refer to (ξ_{B(Z)}, ξ_{B̄(Z)}) as the "slack coordinate associated with the index set Z". We denote by R(Z, B) the index set Z − B. Due to the definitions, there exists a constant matrix T_BR such that

(3.3)  A_R = A_B T_BR.
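The coordinate change (3.2) is just a linear solve; the following sketch (with illustrative data for A, b and the index set B ∪ B̄ — these are assumptions, not data from the paper) checks that the two maps in (3.2) invert each other.

```python
import numpy as np

# Sketch of the slack coordinate (3.2): when A_{B u Bbar} is nonsingular,
# xi_{B u Bbar} = A_{B u Bbar}^t x - b_{B u Bbar} determines x uniquely.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])          # columns a_1, a_2, a_3 (n = 2, m = 3)
b = np.array([0.0, 0.0, -1.0])
BuB = [0, 1]                             # an index set with A_{B u Bbar} nonsingular

x = np.array([0.7, 1.3])
xi = A[:, BuB].T @ x - b[BuB]            # forward map of (3.2)
x_back = np.linalg.solve(A[:, BuB].T, xi + b[BuB])  # inverse map of (3.2)
print(np.allclose(x, x_back))            # -> True
```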

With the index set Z and its associated pair of basis index sets (B, B̄) determined, we define the matrices Ā_{B(Z)} and Ā_{B̄(Z)} as the column blocks of the inverse of A_{B(Z)∪B̄(Z)}^t:

(3.4)  (Ā_{B(Z)}  Ā_{B̄(Z)}) = (A_{B(Z)∪B̄(Z)}^t)^{-1}.

Then we have

(3.5)  A_{B(Z)}^t Ā_{B(Z)} = I,  A_{B(Z)}^t Ā_{B̄(Z)} = O,  A_{B̄(Z)}^t Ā_{B(Z)} = O,  A_{B̄(Z)}^t Ā_{B̄(Z)} = I,
       A_{B(Z)} Ā_{B(Z)}^t + A_{B̄(Z)} Ā_{B̄(Z)}^t = I.

Note that Ā_{B(Z)} A_{B(Z)}^t and Ā_{B̄(Z)} A_{B̄(Z)}^t are projection matrices. The objective function c^t x can be written as

(3.6)  c^t x = c^t Ā_{B(Z)} (ξ_{B(Z)} + b_{B(Z)}) + c^t Ā_{B̄(Z)} (ξ_{B̄(Z)} + b_{B̄(Z)})

in terms of the coordinate (ξ_{B(Z)}, ξ_{B̄(Z)}). With these notations, all the constraints can be categorized into four groups:

(3.7)  ξ(x) = A^t x − b = ( A_{R(Z,B)}^t x − b_{R(Z,B)}
                           A_{B(Z)}^t x − b_{B(Z)}
                           A_{B̄(Z)}^t x − b_{B̄(Z)}
                           A_{N(Z,B̄)}^t x − b_{N(Z,B̄)} ) = ( ξ_R(x)
                                                              ξ_B(x)
                                                              ξ_B̄(x)
                                                              ξ_N(x) ),

where

N(Z, B̄) = {1, ..., m} − Z − B̄ = {1, ..., m} − R − B ∪ B̄.

We also use R and N as global notations in this paper. We omit the arguments (Z, B) of R and (Z, B̄) of N if they are obvious from the context. Let F be a face of P. As defined above, the index set E(F) consists of the indices which are always active on F. Due to the basic theory of polyhedra [13], we have the following proposition in the special case where Z is taken to be E(F).

PROPOSITION 3.1. Let F be a face of P, and let (B, B̄) be a pair of basis index sets associated with the index set E(F). We can represent A_R and ξ_R(x) as

(3.8)  A_R = A_B T_BR  and  ξ_R(x) = T_BR^t ξ_B(x)

with an appropriate matrix T_BR, and hence F = T(B).

A face F of P is referred to as a "dual degenerate face" if the objective function c^t x is constant on the face. We include vertices also as dual degenerate faces. Dual degenerate faces are characterized as follows.

PROPOSITION 3.2. A face F of P is a dual degenerate face if and only if c ∈ Im(A_{E(F)}).

PROOF. It is an easy exercise, hence we omit the proof. □

By definition, the set of the optimal solution(s) is a dual degenerate face. We note that a dual degenerate face does not necessarily contain an optimal solution: any face F of P that is contained in a hyperplane {x | c^t x = c_0} with an appropriate c_0 is a dual degenerate face; for example, every nonoptimal vertex is a dual degenerate face. In terms of the notations introduced above, the assumptions on degeneracy concerning the linear programming problem (D) are written as follows:
(1) Assumption of primal nondegeneracy: For each face F of P, there exists no redundant constraint which is always active on the face. In other words, E(F) is the unique index set such that T(E(F)) = F.
(2) Assumption of dual nondegeneracy (assumption (3) of this paper): P has no dual degenerate face other than its vertices, i.e., c ∉ Im(A_{E(F)}) for each face F of P except for the case where F is a vertex.
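The criterion of Proposition 3.2 is a membership test c ∈ Im(A_{E(F)}), which can be carried out numerically by least squares. A small sketch (A and c are illustrative assumptions, not data from the paper):

```python
import numpy as np

# Proposition 3.2 style test: a face F is dual degenerate iff
# c lies in the column span of A_{E(F)}.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])          # columns a_1, a_2, a_3 (n = 2, m = 3)
c = np.array([2.0, 3.0])

def dual_degenerate(E):
    AE = A[:, sorted(E)]                  # columns of the constraints active on F
    y, *_ = np.linalg.lstsq(AE, c, rcond=None)
    return bool(np.allclose(AE @ y, c))   # is c in the column span?

print(dual_degenerate({0, 1}))            # a_1, a_2 span R^2 -> True
print(dual_degenerate({2}))               # c is not parallel to a_3 -> False
```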


We do not require (1) in this paper, and will prove in §7 the global convergence under assumption (2).

4. Asymptotic formula of the projection matrix associated with the dual affine scaling method. In §2 we mentioned that iteration (2.1) is written as (2.3) in the space of slack variables. We easily see that formula (2.3) may be represented as follows:

(4.1)  ξ(x⁺) = ξ(x) − μ [ξ(x)] P(x) a(x) / {a(x)^t P(x) a(x)}^{1/2},

where μ is the step-size,

(4.2)  P(x) = [ξ(x)]^{-1} A^t G(x)^{-1} A [ξ(x)]^{-1},  a(x) = [ξ(x)] y.

Here y is a vector in R^m which satisfies the equality Ay = c. Note that P(x) is a projection matrix. Multiplying both sides of (4.1) by [ξ]^{-1}, we have

(4.3)  [ξ]^{-1} ξ⁺ = 1 − μ P a / {a^t P a}^{1/2}.

This means that the value of each slack variable is multiplied at most by a factor of (1 + μ) at each step of iteration (2.1). In the analysis of the global convergence of the dual affine scaling method, it is important to investigate the asymptotic properties of the projection matrix P(x) when the sequence (or a subsequence) approaches a face or diverges to infinity. (Under assumptions (2) and (3), it easily follows that the sequence is bounded and every accumulation point is on a face. But, for further extensions, we consider the general case here.) The following lemma gives a useful expression of P(x) for our purpose.

LEMMA 4.1. Let F be an index set, and choose a pair of basis index sets (B(F), B̄(F)) associated with F to take the slack coordinate (ξ_{B(F)}, ξ_{B̄(F)}). Let x be an interior point of P and let the slack variable ξ(x) be put in order as

ξ(x) = (ξ_F(x), ξ_{B̄(F)}(x), ξ_{N(F)}(x)) = (ξ_{R(F)}(x), ξ_{B(F)}(x), ξ_{B̄(F)}(x), ξ_{N(F)}(x)).

Then the matrix P(x) is written as follows. (A matrix with a pair of index sets as its lower indices, say C_{Z1Z2}, represents a matrix whose rows and columns are associated with the first index set (Z1) and the second (Z2), respectively. We use this convention throughout the paper.) With the row and column blocks ordered as F and B̄(F) ∪ N(F),

(4.4)  P(x) = ( P_FF − Q_FF    O
                O              P_{F^cF^c} + (I  S_B̄N)^t ΔQ_B̄B̄ (I  S_B̄N) ) + ΔP − ΔQ,


where, with the blocks ordered as F, B̄(F), N(F),

ΔP = ( O              O    P_FF S_FN
       O              O    O
       S_FN^t P_FF    O    S_FN^t P_FF S_FN ),

ΔQ = ( O                                  Q_FB̄            Q_FF S_FN + Q_FB̄ S_B̄N
       Q_FB̄^t                            O                Q_FB̄^t S_FN
       S_FN^t Q_FF + S_B̄N^t Q_FB̄^t     S_FN^t Q_FB̄     S_FN^t Q_FF S_FN + S_FN^t Q_FB̄ S_B̄N + S_B̄N^t Q_FB̄^t S_FN ),

and

(4.5)  Q_FF = P_FF S_FN (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1} S_FN^t P_FF,
       Q_FB̄ = P_FF S_FN (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1} S_B̄N^t,
       ΔQ_B̄B̄ = S_B̄N (I + S_B̄N^t S_B̄N)^{-1} S_FN^t P_FF (I + P_FF S_FN (I + S_B̄N^t S_B̄N)^{-1} S_FN^t P_FF)^{-1} P_FF S_FN (I + S_B̄N^t S_B̄N)^{-1} S_B̄N^t,
       P_FF = (S_BR  I)^t (S_BR S_BR^t + I)^{-1} (S_BR  I),
       P_{F^cF^c} = (I  S_B̄N)^t (I + S_B̄N S_B̄N^t)^{-1} (I  S_B̄N),
       S_BR = [ξ_B] Ā_B^t A_R [ξ_R]^{-1},  S_BN = [ξ_B] Ā_B^t A_N [ξ_N]^{-1},  S_B̄N = [ξ_B̄] Ā_B̄^t A_N [ξ_N]^{-1},
       S_FN = (O ; S_BN)  (the rows associated with R(F) are zero),

with P_FF ∈ R^{|F|×|F|}, Q_FF ∈ R^{|F|×|F|}, Q_FB̄ ∈ R^{|F|×|B̄|}, S_FN ∈ R^{|F|×|N|}, S_B̄N ∈ R^{|B̄|×|N|}, S_BR ∈ R^{|B|×|R|}, ΔQ_B̄B̄ ∈ R^{|B̄|×|B̄|} and P_{F^cF^c} ∈ R^{|B̄∪N|×|B̄∪N|}. P_FF(x) and P_{F^cF^c}(x) are projection matrices.


PROOF. The definition of P is [ξ]^{-1} A^t G^{-1} A [ξ]^{-1}. We denote the index set F ∪ B̄ by V, and introduce the matrix A_V = (A_R  A_B  A_B̄). Since A_V is a full rank matrix (recall that (B, B̄) is a pair of basis index sets), G_V = A_V [ξ_V]^{-2} A_V^t is also a nonsingular matrix. Putting

(4.6)  S_VN = ( S_FN ; S_B̄N ),

we have

(4.7)  A[ξ]^{-1} = (A_V [ξ_V]^{-1}  A_N [ξ_N]^{-1}) = A_V [ξ_V]^{-1} (I  S_VN),

since A_N [ξ_N]^{-1} = A_B [ξ_B]^{-1} S_BN + A_B̄ [ξ_B̄]^{-1} S_B̄N = A_V [ξ_V]^{-1} S_VN.

By using the Sherman-Morrison-Woodbury formula

(4.8)  (A + UU^t)^{-1} = A^{-1} − A^{-1} U (I + U^t A^{-1} U)^{-1} U^t A^{-1},
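Identity (4.8) is standard; as a quick numerical sanity check (random data, dimensions arbitrary):

```python
import numpy as np

# Numerical check of the Sherman-Morrison-Woodbury formula (4.8):
# (A + U U^t)^{-1} = A^{-1} - A^{-1} U (I + U^t A^{-1} U)^{-1} U^t A^{-1}.
rng = np.random.default_rng(0)
n, k = 6, 3
M = rng.standard_normal((n, n))
Apd = M @ M.T + n * np.eye(n)            # symmetric positive definite "A"
U = rng.standard_normal((n, k))

lhs = np.linalg.inv(Apd + U @ U.T)
Ainv = np.linalg.inv(Apd)
rhs = Ainv - Ainv @ U @ np.linalg.inv(np.eye(k) + U.T @ Ainv @ U) @ U.T @ Ainv
print(np.allclose(lhs, rhs))             # -> True
```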

G^{-1} is written as follows:

(4.9)  G^{-1} = (G_V + A_V [ξ_V]^{-1} S_VN S_VN^t [ξ_V]^{-1} A_V^t)^{-1}
             = G_V^{-1} − G_V^{-1} A_V [ξ_V]^{-1} S_VN (I + S_VN^t [ξ_V]^{-1} A_V^t G_V^{-1} A_V [ξ_V]^{-1} S_VN)^{-1} S_VN^t [ξ_V]^{-1} A_V^t G_V^{-1}.


Substituting (4.7), (4.9) into the definition of P, we obtain

(4.10)  P = [ξ]^{-1} A^t G^{-1} A [ξ]^{-1} = (I  S_VN)^t { P_VV − P_VV S_VN (I + S_VN^t P_VV S_VN)^{-1} S_VN^t P_VV } (I  S_VN),

where P_VV = [ξ_V]^{-1} A_V^t G_V^{-1} A_V [ξ_V]^{-1}. In the following we deal with the matrix P_VV. From (3.3) and (3.5), we see

[ξ_B̄] Ā_B̄^t A_R [ξ_R]^{-1} = [ξ_B̄] Ā_B̄^t A_B T_BR [ξ_R]^{-1} = O.

Taking note of this fact, we have

(4.11)  A_V [ξ_V]^{-1} = (A_R [ξ_R]^{-1}  A_B [ξ_B]^{-1}  A_B̄ [ξ_B̄]^{-1}) = (A_B [ξ_B]^{-1}  A_B̄ [ξ_B̄]^{-1}) ( S_BR  I  O
                                                                                                           O     O  I ),

where

(4.12)  S_BR = [ξ_B] Ā_B^t A_R [ξ_R]^{-1}.

Substituting (4.11) into the definition of P_VV and using the fact that (A_B [ξ_B]^{-1}  A_B̄ [ξ_B̄]^{-1}) = A_{B∪B̄} [ξ_{B∪B̄}]^{-1} is an invertible matrix, we have

(4.13)  P_VV = [ξ_V]^{-1} A_V^t G_V^{-1} A_V [ξ_V]^{-1} = ( P_FF  O
                                                           O     I ),

where

(4.14)  P_FF = (S_BR  I)^t (S_BR S_BR^t + I)^{-1} (S_BR  I).

Obviously P_FF is a projection matrix. Substituting (4.13) and (4.6) into (4.10), we see P written as follows:

(4.15)  P(x) = ( P_FF − Q_FF    O
                 O              (I  S_B̄N)^t (I − Q_B̄B̄)(I  S_B̄N) ) + ΔP − ΔQ,

where ΔP and ΔQ are as defined in (4.5) and

(4.16)  Q_B̄B̄ = S_B̄N (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1} S_B̄N^t.

Now, it remains to show that

(4.17)  (I  S_B̄N)^t (I − Q_B̄B̄)(I  S_B̄N) = P_{F^cF^c} + (I  S_B̄N)^t ΔQ_B̄B̄ (I  S_B̄N),

where P_{F^cF^c} and ΔQ_B̄B̄ are defined as in (4.5). Applying the Sherman-Morrison-Woodbury formula (4.8) to (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1}, we have

(4.18)  (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1}
        = (I + S_B̄N^t S_B̄N)^{-1} − (I + S_B̄N^t S_B̄N)^{-1} S_FN^t P_FF (I + P_FF S_FN (I + S_B̄N^t S_B̄N)^{-1} S_FN^t P_FF)^{-1} P_FF S_FN (I + S_B̄N^t S_B̄N)^{-1}.

Substituting (4.18) into the definition of Q_B̄B̄ and using the relation

(4.19)  (I + S_B̄N S_B̄N^t)^{-1} = I − S_B̄N (I + S_B̄N^t S_B̄N)^{-1} S_B̄N^t,

which also follows from (4.8), we obtain (4.17), where P_{F^cF^c} obviously is a projection matrix. This completes the proof. □

REMARK. This lemma holds also in the special cases where F is taken to be empty or the whole index set {1, ..., m}, by neglecting in (4.4) the rows and the columns whose associated index sets are empty, and, in (4.5), putting to be zero the matrices which contain in their definitions a matrix with the empty index set as its lower indices. We apply this rule throughout the paper. Then, we have P(x) = P_{F^cF^c}(x) if F = ∅, and P(x) = P_FF(x) if F = {1, ..., m}.

In order to analyze the behavior of the interior point method in asymptotic situations, we consider the quantity

(4.20)  Φ_F(x) = ‖ξ_F(x)‖ / min_{i∉F} ξ_i(x),

where F is an index set. Conventionally, we define Φ_F(x) = 0 in the special cases where F = ∅ or F = {1, ..., m}. If this quantity is small for some F, it means that the constraints of (D) are categorized into two groups: the one consisting of the constraints ξ_i (i ∈ F) whose residuals are relatively small, and the other consisting of the constraints ξ_i (i ∉ F) whose residuals are large. We choose a specific index set F such that Φ_F(x) takes a small value, and apply Lemma 4.1 in the subsequent sections. In fact, we have the following lemma.
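The quantity (4.20) is straightforward to compute; the following sketch (the slack vector and index sets are arbitrary assumptions) illustrates how Φ_F separates the two groups of residuals.

```python
import numpy as np

# Phi_F(x) = ||xi_F(x)|| / min_{i not in F} xi_i(x), as in (4.20): small
# exactly when the residuals indexed by F are uniformly small relative
# to every residual outside F.
def phi(xi, F):
    m = len(xi)
    F = set(F)
    if not F or len(F) == m:             # conventional value Phi = 0
        return 0.0
    outside = [xi[i] for i in range(m) if i not in F]
    return np.linalg.norm([xi[i] for i in sorted(F)]) / min(outside)

xi = np.array([1e-3, 2e-3, 0.5, 1.0])    # residuals of four constraints
print(phi(xi, {0, 1}))                   # small: x is near the face xi_0 = xi_1 = 0
print(phi(xi, {2}))                      # large: {2} is the wrong grouping
```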


LEMMA 4.2. Let F be an index set, and choose a pair of basis index sets (B(F), B̄(F)) associated with F. We take the slack coordinate (ξ_{B(F)}, ξ_{B̄(F)}). Then the norms of the matrices S_FN, Q_FF, Q_FB̄, Q_FB̄ S_B̄N and (I  S_B̄N)^t ΔQ_B̄B̄ (I  S_B̄N), which are defined in Lemma 4.1, are bounded as follows:

(4.21)  ‖S_FN‖ ≤ ‖Ā_B^t A_N‖ Φ_F(x),

(4.22)  ‖Q_FF‖ = ‖P_FF S_FN (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1} S_FN^t P_FF‖ ≤ ‖S_FN‖² = O(Φ_F(x)²),

(4.23)  ‖Q_FB̄‖ = ‖P_FF S_FN (I + S_B̄N^t S_B̄N + S_FN^t P_FF S_FN)^{-1} S_B̄N^t‖ ≤ 2‖S_FN‖(1 + ‖S_FN‖²) = O(Φ_F(x)),

(4.24)  ‖Q_FB̄ S_B̄N‖ ≤ 2‖S_FN‖(1 + ‖S_FN‖²) = O(Φ_F(x)),  ‖(I  S_B̄N)^t ΔQ_B̄B̄ (I  S_B̄N)‖ = O(Φ_F(x)²).

... if F is a dual degenerate face that does not contain the whole set of the optimal solutions of (D) and x is in the vicinity of the face F. This fact is described more precisely as Lemma 5.1 below.

LEMMA 5.1. Let (D) be a linear programming problem with a nonzero objective function (i.e., c ≠ 0) satisfying assumption (1), and let F be a dual degenerate face which does not contain the whole set of the optimal solution(s) of (D), where the value of the objective function on F is c_0. (In the case where the optimal solution does not exist, all the dual degenerate faces of (D) satisfy this condition.) Let x be an interior point such that c^t x − c_0 > 0, and denote by x⁺ the new iterate obtained by performing one iteration of the dual affine scaling method (2.1) with the step-size μ. Then, if (i) ξ_{E(F)}(x) is sufficiently small, (ii) 0 < μ ≤ 1/8 and (iii) c^t x⁺ − c_0 > 0, we have ...
... If λ_K(μ, P_EE, β_E) < 1 and c^t x⁺ − c_0 > 0, the reduction of the Karmarkar potential function (5.1) is bounded from above as follows:

(5.11)  f(ξ_E⁺) − f(ξ_E) ≤ −kλ_K (β_E^t P_EE β_E − 1/k)/‖P_EE β_E − 1_E/k‖ + λ_K²/(2(1 − λ_K)).

PROOF. Since 1_E^t P_EE β_E = 1, we have

(5.12)  [ξ_E]^{-1} ξ_E⁺ = 1_E − μ P_EE β_E / {β_E^t P_EE β_E}^{1/2}.

On the other hand, the vector 1_E − μ P_EE β_E/{β_E^t P_EE β_E}^{1/2} is rewritten as

(5.13)  1_E − μ P_EE β_E/{β_E^t P_EE β_E}^{1/2} = (1 − μ/(k{β_E^t P_EE β_E}^{1/2})) (1_E − λ_K (P_EE β_E − 1_E/k)/‖P_EE β_E − 1_E/k‖),

where

(5.14)  λ_K = μ ‖P_EE β_E − 1_E/k‖ / ({β_E^t P_EE β_E}^{1/2} − μ/k).

Since x⁺ is an interior point, we have [ξ_E]^{-1} ξ_E⁺ = 1_E − μ P_EE β_E/{β_E^t P_EE β_E}^{1/2} > 0, and hence (5.13) is strictly positive. Then, by using the condition λ_K < 1, we see that the factor (1 − μ/(k{β_E^t P_EE β_E}^{1/2})) on the right-hand side of (5.13) is strictly positive. Noting this fact, substituting (5.12) into (5.8) and using (5.13), we see that the reduction of the potential function is written as follows:

(5.15)  f(ξ_E⁺) − f(ξ_E) = k log(1 − λ_K β_E^t (P_EE β_E − 1_E/k)/‖P_EE β_E − 1_E/k‖) − Σ_{i∈E} log(1 − λ_K (P_EE β_E − 1_E/k)_i/‖P_EE β_E − 1_E/k‖).

In the following we find a function with which (5.15) is bounded from above. For this purpose, we use the following two "well-known" inequalities concerning the logarithm function:

(5.16)  log(1 + θ) ≤ θ  (θ > −1),

(5.17)  Σ_{i=1}^k log(1 + η_i) ≥ Σ_{i=1}^k η_i − ‖η‖²/(2(1 − ‖η‖))  (η = (η_1, ..., η_k)^t ∈ R^k, ‖η‖ < 1).

Applying (5.16) and (5.17) to give upper bounds for the first and second terms of the right-hand side of (5.15), we see that (5.15) is bounded from above as follows, which is the desired result:

(5.18)  (formula (5.15)) ≤ −kλ_K (β_E^t P_EE β_E − 1/k)/‖P_EE β_E − 1_E/k‖ + λ_K²/(2(1 − λ_K)).

This completes the proof. □

Note that inequality (5.18) differs from the one used in evaluating the reduction of Karmarkar's potential function in the Karmarkar method only in the point that P_EE appears in place of the corresponding projection matrix of the Karmarkar method. (The vector (P_EE β_E − 1_E/k)/‖P_EE β_E − 1_E/k‖ in (5.15) corresponds to the unit displacement vector of the Karmarkar method, and λ_K is the associated step-size.) In the following we evaluate (5.11) in detail to obtain an upper bound for the reduction of the Karmarkar potential function which depends only on k, μ and ‖Q_EE‖. We begin with the following lemma, which is known as a basic result of the theory of duality and complementarity.

LEMMA 5.3. A face F of P is the whole set of the optimal solutions of (D) if and only if there exists a pair (x̄, ȳ) of a point x̄ ∈ F and an optimal solution ȳ of the standard form dual problem of (D),

(5.19)  max b^t y,  subject to Ay = c,  y ≥ 0,

satisfying the following strong complementarity conditions: (i) ξ_{E(F)}(x̄) = 0 and ȳ_{E(F)} > 0; (ii) ξ_i(x̄) > 0 and ȳ_i = 0 for each i ∉ E(F).

PROOF. The proof is left to the readers. (For example, the lemma can be derived as a direct consequence of the results of §7.9 of [13].) □

By using Lemma 5.3, we prove the following lemma, which is a substantial part of the proof of Lemma 5.1.

LEMMA 5.4. If (i) the dual degenerate face F does not contain the whole set of the optimal solutions of (D), (ii) |E(F)| = k ≥ 2 and (iii) c^t x − c_0 > 0, then the norm of the vector P_{E(F)E(F)} β_{E(F)} at x is bounded from below as follows:

(5.20)  β_E^t P_EE β_E ≥ 1/(k − 1).

PROOF. From the assumption c^t x − c_0 > 0, we may assume that the sign of each component of P_EE β_E and of the vector ŷ_E defined below is the same. First we observe that ŷ_E cannot be a strictly positive vector. Here ŷ_E is defined as follows:

(5.21)  ŷ_E = [ξ_E]^{-1} P_EE a_E = [ξ_E]^{-1} P_EE [ξ_E] y_E,

where the definition of a, y, etc. is the same as in the remark following (5.5) and (5.3). Using (4.11) and (4.12) (recall that we put F = E(F) in applying Lemma 4.1), we see that A_E [ξ_E]^{-1} is written as

(5.22)  A_E [ξ_E]^{-1} = A_B [ξ_B]^{-1} (S_BR  I).

Substituting definition (4.14) of P_EE into P_EE a_E, we have

(5.23)  A_E ŷ_E = A_E [ξ_E]^{-1} P_EE a_E = A_B [ξ_B]^{-1} (S_BR  I) P_EE a_E = A_B [ξ_B]^{-1} (S_BR  I) a_E = A_E [ξ_E]^{-1} a_E = c.

In the following we observe that ŷ_E cannot be strictly positive. To see this, assume ŷ_E > 0. Then

(5.24)  ŷ = (ŷ_E, ŷ_{E^c}) = (ŷ_E, 0)

is a feasible solution of the dual problem (5.19) of (D). Choosing x̄ as an interior point of F, we see that the pair (x̄, ŷ) satisfies the strong complementarity conditions of Lemma 5.3. This implies that F is the whole set of optimal solutions of (D), which contradicts the assumptions of Lemma 5.4. Thus ŷ_E, and hence P_EE β_E, cannot be strictly positive. Since the sign of each component of ŷ_E and P_EE β_E is the same and a_E^t 1_E = c^t x − c_0 > 0, we have

(5.25)  min_{i∈E} (P_EE β_E)_i ≤ 0.

On the other hand, the condition

(5.26)  1_E^t P_EE β_E = 1

is imposed on P_EE β_E, as P_EE 1_E = 1_E is always satisfied due to Lemma 4.3 (see (5.7)). Minimizing β_E^t P_EE β_E = ‖P_EE β_E‖² under the conditions (5.25) and (5.26), we obtain

the estimate (5.20). This completes the proof. □

We observe two relations which follow from Lemma 5.4. Since P_EE 1_E = 1_E as was mentioned above, we have, under the assumptions of Lemma 5.4,

(5.27)  ‖P_EE β_E − 1_E/k‖² = β_E^t P_EE β_E − 1/k ≥ 1/(k(k − 1)).
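The minimization that closes the proof of Lemma 5.4 — minimize ‖p‖² over 1_E^t p = 1 with p not strictly positive — has optimal value 1/(k − 1), which is where (5.20) comes from. A small numerical sketch (arbitrary k) spot-checks this.

```python
import numpy as np

# Spot-check of the bound behind (5.20): minimizing ||p||^2 subject to
# 1^t p = 1 and min_i p_i <= 0 (p not strictly positive) gives 1/(k-1),
# attained at p = (0, 1/(k-1), ..., 1/(k-1)).
rng = np.random.default_rng(1)
k = 5
bound = 1.0 / (k - 1)                    # claimed minimum value

p_star = np.array([0.0] + [1.0 / (k - 1)] * (k - 1))
assert abs(p_star.sum() - 1.0) < 1e-12 and p_star.min() <= 0.0
assert abs(p_star @ p_star - bound) < 1e-12   # the bound is attained

for _ in range(10000):
    p = rng.standard_normal(k)
    p -= (p.sum() - 1.0) / k             # project onto the hyperplane 1^t p = 1
    if p.min() <= 0.0:                   # keep only feasible samples
        assert p @ p >= bound - 1e-9     # no feasible p beats 1/(k-1)
print("minimum 1/(k-1) =", bound)        # -> 0.25 for k = 5
```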

Since ‖P_EE β_E‖² = β_E^t P_EE β_E, we also have

(5.28)  ‖P_EE β_E − 1_E/k‖² / (β_E^t P_EE β_E) = 1 − 1/(k β_E^t P_EE β_E),

and it also follows that

(5.29)  ‖P_EE β_E − 1_E/k‖² / (β_E^t P_EE β_E) ≥ 1/k.

LEMMA 5.5. ...

... c^t G(x^{(ν_τ)})^{-1} c / ‖ξ_{E(v̄)}(x^{(ν_τ)})‖² ≥ δ′ holds for all τ, where δ′ is an appropriate positive constant. To prove this fact, we proceed as follows.

Step 1.1. We choose an accumulation point x̄, and denote by F̄ the face that contains x̄ in its interior. We choose a vertex v̄ of F̄. We show the existence of a subsequence {x^{(ν_τ)}} such that the relation

(6.2)  c^t G(x^{(ν_τ)})^{-1} c / ‖ξ_{E(v̄)}(x^{(ν_τ)})‖² ≥ δ′ > 0

holds for all τ by taking an appropriate constant δ′.

Step 1.2. We show that F̄ = v̄, which implies that x̄ is the vertex v̄. Together with the result of Step 1.1, this completes the proof of Step 1.

Step 2. We prove that the sequence {x^{(ν)}} has a unique accumulation point. Together with Step 1, we see that the vertex v̄ (= x̄) is the limit point of {x^{(ν)}}, with the subsequence {x^{(ν_τ)}} satisfying the condition of the lemma. This completes the proof of the lemma.

PROOF OF STEP 1.1. Let x̄ be an accumulation point of {x^{(ν)}}. Denote by F̄ the face of P in the interior of which x̄ lies. Choose one of the vertices of F̄, and denote it by v̄. Let us consider the sequence

(6.3)  s^{(ν)} = ξ^{(ν)} / ‖ξ_{E(v̄)}^{(ν)}‖.

We can choose a convergent subsequence {x^{(ν_τ)}} of {x^{(ν)}} such that {x^{(ν_τ)}} itself converges to x̄ and each component s_i^{(ν_τ)} of {s^{(ν_τ)}} converges or diverges to infinity. We denote by s̄ the "limit point" of {s^{(ν_τ)}}. (Strictly speaking, it is not rigorous to use the term "limit point" here, because a component of s^{(ν_τ)} may not have a finite limit. However, in this case the components which do not have finite limits are guaranteed to diverge to infinity. It is sometimes useful, to save description, to include ∞ formally as a number and to define "convergence to ∞" as "divergence to infinity"; we adopt this convention here. Then, formally, s̄ may contain ∞ among its components.) Let X be the set consisting of the indices i such that s_i^{(ν_τ)} → 0. X may be ∅, but cannot be the whole index set {1, ..., m}. Due to the definition of s^{(ν_τ)} and the boundedness of the sequence {x^{(ν_τ)}}, we have ξ_X(x^{(ν_τ)}) → 0 when τ tends to infinity, and hence

x̄ ∈ T(X) = {x ∈ P | A_X^t x = b_X}.

(N.B. this formally holds also in the case of X = 0.) Denote by X the face P(X). Since W is the unique face which contains x in its interior, 3 is the smallest face as well. which contains x. Hence we have &"c_ . Since 7/c , we see Yc_ In the following we show that Y'is a dual nondegenerate face such that X = E(g) and 7c X. These facts follow trivially if X = 0. (We have '= 9 then.) Hence we deal with the case where X - 0. By the definition of the index set X, it is seen that (6.4)

)x( x())

-0.

Since '= P(X), we have X c E(g(). To show that E(gY) = X, it is enough to find a point x such that sx(x) = 0 and S(ix) > 0 (i A X). We consider the following

551

PROPERTY

GLOBAL CONVERGENCE

equation with respect to Ax(^ defined for each r: A' 1X(^') = sxX(>).

(6.5)

It is easy to see that equation (6.5) always has a solution, say, $\Delta x^{(\nu_\tau)} = x^{(\nu_\tau)} - \bar{x}$. It is well known that if (6.5) has a solution, it also has a solution $\Delta\hat{x}^{(\nu_\tau)}$ whose norm is bounded by

$$
\|\Delta\hat{x}^{(\nu_\tau)}\| \leq M_1 \|s_X(x^{(\nu_\tau)})\|, \tag{6.6}
$$

where $M_1$ is a constant which depends only on $A_X$. For each $\tau$, let us consider the solution $\Delta\hat{x}^{(\nu_\tau)}$ which satisfies (6.6), and put $\tilde{x}^{(\nu_\tau)}$ as follows:

$$
\tilde{x}^{(\nu_\tau)} = x^{(\nu_\tau)} - \Delta\hat{x}^{(\nu_\tau)}. \tag{6.7}
$$

Then, $\tilde{x}^{(\nu_\tau)}$ is a point on $S$ satisfying

$$
a^t_i \tilde{x}^{(\nu_\tau)} - b_i = 0 \quad (i \in X), \tag{6.8}
$$

$$
a^t_i \tilde{x}^{(\nu_\tau)} - b_i = a^t_i x^{(\nu_\tau)} - b_i - a^t_i \Delta\hat{x}^{(\nu_\tau)} \geq s_i(x^{(\nu_\tau)}) - M_1 \|a_i\| \, \|s_X(x^{(\nu_\tau)})\| = \|s^{(\nu_\tau)}_{E(\mathcal{T})}\| \bigl( \bar{s}^{(\nu_\tau)}_i - M_1 \|a_i\| \, \|\bar{s}^{(\nu_\tau)}_X\| \bigr) \quad (i \notin X). \tag{6.9}
$$
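The bound (6.6) is the standard minimum-norm-solution estimate: when $A^t_X$ has full row rank, $\Delta\hat{x} = A_X (A^t_X A_X)^{-1} s_X$ solves (6.5) and satisfies $\|\Delta\hat{x}\| \leq M_1 \|s_X\|$ with $M_1 = \lambda_{\min}(A^t_X A_X)^{-1/2}$, a constant depending only on $A_X$. The following sketch checks this on a small made-up instance (the matrix and right-hand side are hypothetical, not taken from the paper):

```python
import math

# Hypothetical 2x3 instance: rows of A_X^t (two active constraints in R^3).
AXt = [(1.0, 1.0, 0.0),   # a_1^t
       (0.0, 1.0, 1.0)]   # a_2^t
sX = (0.3, -0.1)          # sample right-hand side s_X(x)

# Gram matrix A_X^t A_X (2x2) and its inverse via the adjugate
m11 = sum(a * a for a in AXt[0])
m22 = sum(a * a for a in AXt[1])
m12 = sum(a * b for a, b in zip(AXt[0], AXt[1]))
det = m11 * m22 - m12 * m12
w = ((m22 * sX[0] - m12 * sX[1]) / det,
     (-m12 * sX[0] + m11 * sX[1]) / det)

# minimum-norm solution dx = A_X w, a combination of the rows of A_X^t
dx = tuple(w[0] * a + w[1] * b for a, b in zip(AXt[0], AXt[1]))

# residual of A_X^t dx = s_X, and the norms for the bound (6.6)
residual = [sum(r[k] * dx[k] for k in range(3)) - si for r, si in zip(AXt, sX)]
M1 = 1.0  # here A_X^t A_X = [[2, 1], [1, 2]], whose smallest eigenvalue is 1
norm_dx = math.sqrt(sum(v * v for v in dx))
norm_sX = math.sqrt(sum(v * v for v in sX))
```

Because $M_1$ depends only on $A_X$, the same constant serves for every right-hand side $s_X(x^{(\nu_\tau)})$, which is exactly what the argument around (6.6) needs.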

Here equation (6.8) is satisfied for any $\tau$ from the definition of $\Delta\hat{x}^{(\nu_\tau)}$. The rightmost-hand side of (6.9) is strictly greater than $0$ for sufficiently large $\tau$, since we have $\bar{s}^{(\nu_\tau)}_i \to \bar{s}_i > 0$ ($i \notin X$) while $\|\bar{s}^{(\nu_\tau)}_X\| \to 0$ when $\tau$ goes to $\infty$, and $\|s^{(\nu_\tau)}_{E(\mathcal{T})}\| > 0$ for all $\tau$. Thus, for sufficiently large $\tau$, we have $a^t_i \tilde{x}^{(\nu_\tau)} - b_i = 0$ ($i \in X$) and $a^t_i \tilde{x}^{(\nu_\tau)} - b_i > 0$ ($i \notin X$), which implies $E(\mathcal{X}) = X$. From (6.4), we see that

$$
\bar{s}^{(\nu_\tau)}_{E(\mathcal{X})} \to 0 \tag{6.10}
$$

as $\tau$ tends to infinity. Since $\|\bar{s}^{(\nu_\tau)}_{E(\mathcal{T})}\| = 1$ by the normalization (6.3),

$$
\|\bar{s}^{(\nu_\tau)}_{E(\mathcal{T}) \setminus E(\mathcal{X})}\| \geq \delta > 0 \tag{6.11}
$$

holds for sufficiently large $\tau$, where $\delta$ is a constant. Since $E(\mathcal{X}) \subseteq E(\mathcal{T})$, we can choose, due to Proposition 3.1, an appropriate pair of basis index sets $(B(E(\mathcal{T})), B(E(\mathcal{X})))$ associated with $E(\mathcal{T})$ and $E(\mathcal{X})$, respectively, such that $B(E(\mathcal{X})) \subseteq B(E(\mathcal{T})) \subseteq E(\mathcal{T})$. Since $\mathrm{Im}(A_{B(E(\mathcal{T}))}) = \mathbf{R}^n$ and $\mathcal{X}$ is a dual nondegenerate face, due to Proposition 3.2, there exists $y$ such that

$$
Ay = A_{B(E(\mathcal{T}))} y_{B(E(\mathcal{T}))} = c, \qquad y_{B(E(\mathcal{T})) \setminus B(E(\mathcal{X}))} \neq 0, \qquad y_{N(E(\mathcal{T}))} = 0. \tag{6.12}
$$

Let

$$
\alpha^{(\nu_\tau)} = [\bar{s}^{(\nu_\tau)}] y = [s^{(\nu_\tau)}] y / \|s^{(\nu_\tau)}_{E(\mathcal{T})}\|. \tag{6.13}
$$

Then $\|\alpha^{(\nu_\tau)}\|$ is bounded from above as

$$
\|\alpha^{(\nu_\tau)}\| = \bigl\| [\bar{s}^{(\nu_\tau)}_{B(E(\mathcal{T}))}] y_{B(E(\mathcal{T}))} \bigr\| \leq \|y_{B(E(\mathcal{T}))}\|, \tag{6.14}
$$

though $\|\bar{s}^{(\nu_\tau)}\|$ may diverge, because $B(E(\mathcal{T})) \cup B(E(\mathcal{X})) \subseteq E(\mathcal{T})$ and $\|\bar{s}^{(\nu_\tau)}_{E(\mathcal{T})}\| = 1$. With this $\alpha^{(\nu_\tau)}$, the quantity $c^t G(x^{(\nu_\tau)})^{-1} c / \|s^{(\nu_\tau)}_{E(\mathcal{T})}\|^2$ is written as follows:

$$
\frac{c^t G(x^{(\nu_\tau)})^{-1} c}{\|s^{(\nu_\tau)}_{E(\mathcal{T})}\|^2} = \alpha^{(\nu_\tau)t} P^{(\nu_\tau)} \alpha^{(\nu_\tau)}, \tag{6.15}
$$

where $P^{(\nu_\tau)} = [s^{(\nu_\tau)}]^{-1} A^t G(x^{(\nu_\tau)})^{-1} A [s^{(\nu_\tau)}]^{-1}$ is the orthogonal projection matrix onto $\mathrm{Im}([s^{(\nu_\tau)}]^{-1} A^t)$.
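The identity behind (6.15) is elementary to verify: with $D = \mathrm{diag}(s)$ and $G = A D^{-2} A^t$, the matrix $P = D^{-1} A^t G^{-1} A D^{-1}$ is idempotent, is invariant under the rescaling $s \mapsto \kappa s$, and satisfies $(Dy)^t P (Dy) = c^t G^{-1} c$ whenever $Ay = c$. A small numeric sketch with made-up data (not taken from the paper):

```python
# Tiny made-up data (n = 2, m = 3); the columns of A are the normals a_i.
A = [(1.0, 0.0, -1.0),
     (0.0, 1.0, -1.0)]
s = [0.3, 0.5, 2.0]       # positive slacks
y = [1.0, 2.0, 0.5]       # any multiplier vector; c := A y

def gram_inv(s):
    # G = A D^{-2} A^t (2x2), D = diag(s), inverted via the adjugate
    g = [[sum(A[i][k] * A[j][k] / s[k] ** 2 for k in range(3))
          for j in range(2)] for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def proj(s):
    # P = D^{-1} A^t G^{-1} A D^{-1} (3x3)
    ginv = gram_inv(s)
    return [[sum(A[i][p] * ginv[i][j] * A[j][q]
                 for i in range(2) for j in range(2)) / (s[p] * s[q])
             for q in range(3)] for p in range(3)]

P = proj(s)
P_scaled = proj([2.5 * si for si in s])        # invariance under s -> kappa*s
PP = [[sum(P[p][k] * P[k][q] for k in range(3)) for q in range(3)]
      for p in range(3)]                       # idempotency: P P = P

c = [sum(A[i][k] * y[k] for k in range(3)) for i in range(2)]
ginv = gram_inv(s)
ct_ginv_c = sum(c[i] * ginv[i][j] * c[j] for i in range(2) for j in range(2))
quad = sum((s[p] * y[p]) * P[p][q] * (s[q] * y[q])
           for p in range(3) for q in range(3))   # should equal c^t G^{-1} c
```

The scale invariance is what allows the normalized slacks $\bar{s}^{(\nu_\tau)}$ to be used interchangeably with $s^{(\nu_\tau)}$ inside $P^{(\nu_\tau)}$.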

Below, we abbreviate $E(\mathcal{X})$ as $E$, $B(E(\mathcal{T})) \setminus B(E(\mathcal{X}))$ as $B$, and the set of the remaining indices as $N$, to simplify the notation. Using Lemma 4.1 (with $F := E(\mathcal{X})$) and noting that $\alpha^{(\nu_\tau)}_N = 0$, we have

$$
\alpha^{(\nu_\tau)t} P^{(\nu_\tau)} \alpha^{(\nu_\tau)}
= \alpha^{(\nu_\tau)t}_E \bigl(I - Q^{(\nu_\tau)}_{EE}\bigr) \alpha^{(\nu_\tau)}_E
- 2\, \alpha^{(\nu_\tau)t}_E Q^{(\nu_\tau)}_{EB} \alpha^{(\nu_\tau)}_B
+ \alpha^{(\nu_\tau)t}_B \bigl(I + [\bar{s}^{(\nu_\tau)}_B]^{-1} A^t_B A_N [\bar{s}^{(\nu_\tau)}_N]^{-2} A^t_N A_B [\bar{s}^{(\nu_\tau)}_B]^{-1}\bigr)^{-1} \alpha^{(\nu_\tau)}_B
+ \alpha^{(\nu_\tau)t}_B \Delta Q^{(\nu_\tau)}_{BB} \alpha^{(\nu_\tau)}_B. \tag{6.16}
$$

Let us denote the limit point of the convergent sequence $\{\alpha^{(\nu_\tau)}\}$ by $\bar{\alpha} = [\bar{s}] y$. Due to the definition of $\alpha^{(\nu_\tau)}$ and (6.10), we have $\bar{\alpha}_E = \bar{\alpha}_{E(\mathcal{X})} = 0$. From (6.10) and Lemma 4.2, we have $\|Q^{(\nu_\tau)}_{EE}\| \to 0$, $\|Q^{(\nu_\tau)}_{EB}\| \to 0$ and $\|\Delta Q^{(\nu_\tau)}_{BB}\| \to 0$ when $\tau \to \infty$. Together with (6.14) and $\bar{\alpha}_E = 0$, this implies that the first, the second and the fourth term on the right-hand side of (6.16) converge to $0$. On the other hand, the third term can be bounded from below for each $\tau$ as follows:

$$
\alpha^{(\nu_\tau)t}_B \bigl(I + [\bar{s}^{(\nu_\tau)}_B]^{-1} A^t_B A_N [\bar{s}^{(\nu_\tau)}_N]^{-2} A^t_N A_B [\bar{s}^{(\nu_\tau)}_B]^{-1}\bigr)^{-1} \alpha^{(\nu_\tau)}_B
\geq M_2 \min_{i \notin E(\mathcal{X})} \{\bar{s}^{(\nu_\tau)}_i\}^2 \, \|y_B\|^2, \tag{6.17}
$$
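The third term of (6.16) is a quadratic form in the inverse of an "identity plus positive semidefinite" matrix, and lower bounds of the type appearing in (6.17) come from the elementary estimate $\alpha^t (I + K)^{-1} \alpha \geq \|\alpha\|^2 / (1 + \|K\|)$ for symmetric positive semidefinite $K$. A quick numeric check with made-up data (the matrix and vector below are illustrative only):

```python
import math

# Made-up symmetric PSD K (2x2) and vector alpha, to check
#   alpha^t (I + K)^{-1} alpha >= ||alpha||^2 / (1 + ||K||),
# where ||K|| is the spectral norm (largest eigenvalue of K here).
K = [[1.0, 0.5],
     [0.5, 2.0]]
alpha = (1.0, -1.0)

# (I + K)^{-1} via the 2x2 adjugate
M = [[1.0 + K[0][0], K[0][1]],
     [K[1][0], 1.0 + K[1][1]]]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[M[1][1] / det, -M[0][1] / det],
        [-M[1][0] / det, M[0][0] / det]]

lhs = sum(alpha[i] * Minv[i][j] * alpha[j]
          for i in range(2) for j in range(2))

# spectral norm of the symmetric 2x2 K: its largest eigenvalue
tr = K[0][0] + K[1][1]
dk = K[0][0] * K[1][1] - K[0][1] * K[1][0]
norm_K = (tr + math.sqrt(tr * tr - 4.0 * dk)) / 2.0

rhs = (alpha[0] ** 2 + alpha[1] ** 2) / (1.0 + norm_K)
```

In the proof, $\|\alpha_B\|^2 \geq \min_i \{\bar{s}_i\}^2 \|y_B\|^2$ plays the role of $\|\alpha\|^2$, which is why the squared minimum slack appears on the right-hand side of (6.17).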


where $M_2$ is an appropriate constant. Since we have $y_B \neq 0$ and $\min_{i \notin E(\mathcal{X})} \bar{s}^{(\nu_\tau)}_i$ converges to the positive constant $\min_{i \notin X} \bar{s}_i > 0$ when $\tau \to \infty$ (recall the fact $E(\mathcal{X}) = X$ and the definition of $X$), we see that (6.16) is greater than a positive constant for sufficiently large $\tau$. This implies that

$$
\frac{c^t G(x^{(\nu_\tau)})^{-1} c}{\|s^{(\nu_\tau)}_{E(\mathcal{T})}\|^2} \geq \delta' > 0 \tag{6.18}
$$

holds for sufficiently large $\tau$, where $\delta'$ is an appropriate constant. This completes the proof of Step 1.1.

PROOF OF STEP 1.2.

Now we show that $\mathcal{G} = \mathcal{T}$. By definition, we have $\mathcal{T} \subseteq \mathcal{G} \subseteq S$. Recall that the subsequence $\{x^{(\nu_\tau)}\}$ converges to $\bar{x}$, the interior point of $\mathcal{G}$. If $\mathcal{G} \neq \mathcal{T}$, then, since $E(\mathcal{T}) \supsetneq E(\mathcal{G})$, $\|s^{(\nu_\tau)}_{E(\mathcal{T})}\|$ is strictly greater than a positive constant, say $\eta$, for sufficiently large $\tau$. This implies that the decrease of the objective function at the $\nu_\tau$th iteration can be bounded from above by a negative constant as follows for sufficiently large $\tau$:

$$
c^t x^{(\nu_\tau + 1)} - c^t x^{(\nu_\tau)} = -\lambda \bigl( c^t G(x^{(\nu_\tau)})^{-1} c \bigr)^{1/2} \leq -\lambda \delta'^{1/2} \|s^{(\nu_\tau)}_{E(\mathcal{T})}\| \leq -\lambda \delta'^{1/2} \eta / 2, \tag{6.19}
$$

since $\|s^{(\nu_\tau)}_{E(\mathcal{T})}\| > \eta / 2$ holds for sufficiently large $\tau$. On the other hand, as was mentioned in the note following (4.3), each slack variable is multiplied at most by the factor of $(1 + \lambda)$ per iteration. This means that the distance between $x^{(\nu)}$ and the nearest vertex $v(x^{(\nu)})$ is multiplied at most by the factor of $M_3 (1 + \lambda)$ per iteration, where $M_3$ is a constant determined only by the matrix $A$. Then, we have

$$
\eta / 2 \tag{6.20}
$$