Feasibility and Constraint Analysis of Sets of Linear Matrix Inequalities

Richard J. Caron, Tim Traynor
University of Windsor, Windsor, Ontario, N9B 3P4, Canada, {[email protected], [email protected]}

Shafiu Jibrin Northern Arizona University, Flagstaff, Arizona, 86011, USA

We present a constraint analysis methodology for Linear Matrix Inequality (LMI) constraints. If the constraint set is found to be feasible, we search for a minimal representation; otherwise, we search for an irreducible infeasible system. The work is based on the solution of a set covering problem in which each row corresponds to a sample point and is determined by constraint satisfaction at that point. Thus, an implementation requires a method to collect points in the ambient space and a constraint oracle. Much of this paper is devoted to the development of a hit and run sampling methodology. Test results confirm that our approach not only provides the information required for constraint analysis, but will also, with probability one, find a feasible point whenever the feasible region has a non-void interior.

Key words: linear matrix inequalities; positive semidefinite programming; feasibility; redundancy; irreducible infeasible sets

1. Introduction

Linear Matrix Inequality (LMI) constraints are important in semidefinite programs (SDP), a class of problems of interest because they are very general (see, for example, the survey in (Vandenberghe and Boyd, 1996)) and because they can be solved using efficient barrier function methods (see, for example, (Alizadeh, 1995) and (Wolkowicz, 2001)). It has been established (see, for example, (Andersen and Andersen, 1995)) that for linear programs the removal of redundancies can improve the performance of interior point algorithms, thereby increasing the size of the problems that can be solved. That the same is true for positive semidefinite programs was shown in (Jibrin, 1998); it was also shown there that the elimination of all redundancies and the detection of an irreducible feasible set with deterministic methods is NP-complete. If the feasible region is empty, the removal of all possible redundancies amounts to the detection of an irreducible infeasible set (Chinneck and Dravnieks, 1991). In this paper, we show how the framework for constraint analysis given by Caron and Traynor (2007), which builds on the work of Boneh (1984), can be applied to Linear Matrix Inequality (LMI) constraint sets.


The set of Linear Matrix Inequalities (LMIs) of concern herein is of the form
$$A_0^j + \sum_{i=1}^{n} x_i A_i^j \succeq 0, \qquad j \in J = \{1, \dots, q\}, \tag{1}$$
where the A_i^j are m_j × m_j symmetric matrices and x ∈ R^n. Here A ⪰ 0 means that the matrix A is positive semidefinite; A ≻ 0, that A is positive definite. We define, for each j, the function F^j on R^n by
$$F^j(x) = A_0^j + \sum_{i=1}^{n} x_i A_i^j.$$

The feasible region defined by these inequalities is
$$Z(J) = \{x \in \mathbb{R}^n \mid F^j(x) \succeq 0,\ \forall j \in J\},$$
which may, or may not, be empty. We assume throughout that, for each j, there exists an x such that F^j(x) is non-singular; it then follows that F^j(x) is non-singular for almost all x ∈ R^n, by the following theorem applied to the determinant of F^j(x).

Theorem 1. A polynomial function from R^n to R is either identically 0 or non-zero almost everywhere.

An elementary proof can be given using only the Fubini theorem; for details, see (Traynor and Caron, 2005).

In section 2 of this paper we apply the framework for constraint analysis given by Caron and Traynor (2007), which builds on the work of Boneh (1984), to analyze the system (1). If the feasible region is nonempty, we show how our approach identifies redundancy and a feasible point; otherwise, we identify an irreducible infeasible set. In section 3 we present six algorithms for collecting the information required for the analysis, section 4 provides details on the positive definite intervals of matrix pencils necessary for the implementation, and section 5 gives numerical results.

2. A Framework for Constraint Analysis

Let X_j be the set of points in R^n that satisfy the j-th LMI constraint. The intersection of all of these is the feasible set, denoted Z(J). More generally, for J′ ⊆ J,
$$Z(J') = \bigcap_{j \in J'} X_j.$$
We are interested in subsets J′ of J such that Z(J′) = Z(J). We call such a J′ a reduction of J. The family {X_j}_{j∈J′}, or the corresponding J′, is irreducible if there is no proper reduction J″ of J′. We find that J′ is a reduction of J if and only if
$$\bigcup_{j \in J'} X_j^c \supseteq Z(J)^c, \tag{2}$$
which tells us that each infeasible point is in some X_j^c, j ∈ J′, that is, violates some constraint of the reduction J′.

We denote by δ_j the indicator (or characteristic) function of X_j^c and gather the δ_j into one "binary word valued indicator function" δ = (δ_j)_{j∈J}. Thus, δ : R^n → {0,1}^J. Its sets of constancy, in other words the equivalence classes [x] = δ^{-1}(δ(x)) = {y : δ(y) = δ(x)}, partition R^n. In terms of δ, the inclusion (2) can be expressed as follows.

Theorem 2 (Caron and Traynor (2007)). The set J′ is a reduction of J if and only if δ(x) · 1_{J′} > 0 for all x with δ(x) ≠ 0, where 1_{J′} is the binary word y with y_j = 1 if j ∈ J′, and y_j = 0 otherwise.

Thus, if we let E be the matrix whose rows, indexed by the infeasible equivalence classes, are all the possible words δ(x) other than 0, an irreducible reduction can be found by solving the standard Set Covering (SC) problem
$$\begin{array}{ll} \text{minimize} & \Sigma y = \mathbf{1} \cdot y \\ \text{subject to} & Ey \geq \mathbf{1}, \quad y \text{ binary}, \end{array}$$
where 1 is a vector of ones of appropriate dimension. Suppose we have a solution to the set covering problem that indicates an irreducible reduction J′. If the system is feasible, then J′ provides an irreducible feasible set. If the system is infeasible, then J′ provides an irreducible infeasible system (IIS). The word in E with the smallest row sum then indicates a minimal infeasibility set (Chakravarti, 1994). Since any feasible solution y to the SC problem corresponds to a reduction, one needn't actually find an optimal solution to obtain a benefit, and heuristics, such as Chvatal's greedy algorithm (Chvatal, 1979), are sufficient for the purpose. However, there is still the problem of the determination of the matrix E.
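To make the use of such a heuristic concrete, the following is a minimal sketch (our own illustration, not code from the paper) of a Chvatal-style greedy cover over a 0/1 matrix E; the function name and data layout are assumptions of the sketch.

```python
import numpy as np

def greedy_cover(E):
    """Chvatal-style greedy heuristic for min 1'y subject to Ey >= 1, y binary.

    E is a 0/1 array whose rows are the nonzero observations delta(x);
    returns column indices forming a (not necessarily optimal) reduction J'.
    """
    E = np.asarray(E, dtype=bool)
    uncovered = np.ones(E.shape[0], dtype=bool)  # rows not yet covered
    chosen = []
    while uncovered.any():
        counts = E[uncovered].sum(axis=0)        # coverage gain per constraint
        j = int(np.argmax(counts))
        if counts[j] == 0:
            raise ValueError("a row of E is all zeros and cannot be covered")
        chosen.append(j)
        uncovered &= ~E[:, j]                    # rows hit by column j are covered
    return chosen
```

Any cover found this way corresponds to a reduction, so even the heuristic output is useful, in line with the remark above.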

3. Collecting covering matrix rows

Let us assume that we are sampling the points of R^n according to a distribution that is equivalent to Lebesgue measure, in the sense that it has the same sets of measure zero, and calculating the corresponding δ(x). Provided that there are no implicit equalities, the fact that the LMI constraints are convex entails that each δ(x) has a nonzero probability of being detected (Corollary 7 in (Caron and Traynor, 2007)). In our numerical experiments we did this by sampling uniformly in a bounding box. We refer to the method as Algorithm 1 and say that it "generates random observations in a box".

In Algorithm 1, as well as in all subsequent algorithms, M is the iteration limit, B is a sampling box centered at the origin with sides of length 2b, and x_0 ∈ B is randomly chosen so that F^j(x_0) is invertible for all j ∈ J. All algorithms start with the iteration counter k = 0 and with E = δ(x_0) if δ(x_0) ≠ 0, E = ∅ otherwise. Each algorithm outputs a "most feasible" point z with 1^T δ(z) ≤ 1^T δ(x) for all sampled points x. (This is the commonly used measure of infeasibility in terms of the number of violated constraints, often abbreviated as NINF; see, for example, (Chinneck, 2004).) Initially, z = x_0. After each algorithm is complete, the final step is to remove those rows of E that are redundant by virtue of bit-wise majorization, i.e., to clean E, and then to solve min{1^T y | Ey ≥ 1, y ∈ {0,1}^q}.

Algorithm 1
  while k ≤ M do
    Choose x uniform in B
    Evaluate F^j(x) for all j to calculate δ(x)
    If δ(x) ≠ 0, append δ(x) to E
    If 1^T δ(x) < 1^T δ(z), set z = x
    Set k to k + 1
  end while

Another way to sample is along straight lines. In (Boneh and Golan, 1979) a "hit and run" algorithm was introduced for the identification of redundancy and feasible region boundedness. This led to the development of a family of variations of the method (Boneh, 1983; Smith, 1984; Telgen, 1979; Berbee et al., 1987; Bélisle et al., 1998, 1993). It was shown in (Bélisle et al., 1993) that these methods, under rather weak conditions, generate points asymptotically uniformly in a bounded open set. We have described these methods in (Caron and Traynor, 2007). In our numerical studies we used variations of the Coordinate Directions (CD) hit and run method introduced by Telgen (1979) and published in (Boneh, 1983). Here, the straight lines are defined by a point x and a direction s taken to be a standard coordinate direction vector chosen in an equally likely way. The next point is chosen uniformly on the intersection of the line {x + ts : t ∈ R} with the region in question. In Algorithm 2, B is the region in question. For each generated point x + ts, we determine which of the
$$F^j(x + ts) = F^j(x) + t \sum_{i=1}^{n} s_i A_i^j, \qquad j = 1, \dots, q \tag{3}$$
are positive semidefinite, so that Algorithm 2 can be compared with Algorithm 1. It should be noted that if s is taken to be the i-th coordinate vector, the matrix F^j(x + ts) reduces to F^j(x) + tA_i^j. Also, because of Theorem 1, each such matrix will be non-singular with probability 1. If one matrix should turn out to be singular, we simply choose a new point. We say that Algorithm 2 "generates CD observations in a box".

Algorithm 2
  while k ≤ M do
    Select an integer i uniformly from {1, . . . , n} and set s to be the i-th coordinate vector
    Select t uniformly from (−b − x_i, b − x_i)
    Replace x with x + ts
    Evaluate F^j(x) for all j to calculate δ(x)
    If δ(x) ≠ 0, append δ(x) to E
    If 1^T δ(x) < 1^T δ(z), set z = x
    Set k to k + 1
  end while

In Algorithm 3, we began taking more advantage of the fact that the points are generated along line segments, as we suggested in (Caron and Traynor, 2007). Specifically, we determined, for each of the matrix pencils in Eq. (3), the t-interval on which it is positive semidefinite, as described in section 4. The endpoints of these intervals are thought of as "crossing points", for they are values of the parameter t where the line x + ts crosses the constraint boundaries. Let the crossing points be denoted by their displacement from x in the direction s, i.e., by t_{ℓ−} < t_{ℓ−+1} < . . . < t_{−1} < 0 < t_1 < · · · < t_{ℓ+−1} < t_{ℓ+}, where ℓ− < 0 and ℓ+ > 0. The number of crossing points is ℓ+ − ℓ− ≤ 2q. At crossing point t_ℓ we denote by j_ℓ the index of the constraint whose boundary is crossed. Let t_i correspond to the boundary of F^{j_i} ⪰ 0 and suppose that t_i < t_{i+1}. As the parameter t increases beyond t_i, the j_i-th bit of δ(x + ts) changes (flips) from 0 to 1 or from 1 to 0. (Each index j will appear at most twice along the line, because of the convexity.) Each new observation δ(x) thus obtained is added as a new row of the set covering matrix E. We say that Algorithm 3 "generates observations along the line".

Algorithm 3
  while k ≤ M do
    Select an integer i uniformly from {1, . . . , n} and set s to be the i-th coordinate vector
    Select t uniformly from (−b − x_i, b − x_i)
    Determine the crossing points t_{ℓ−} < . . . < t_{ℓ+}
    Let δ = δ(x)
    for ℓ = 1, . . . , ℓ+ do
      Flip δ_{jℓ}, the jℓ-th bit of δ
      If δ ≠ 0 and δ is not a row of E, append δ to E
      If 1^T δ < 1^T δ(z), replace z with x + 0.5(t_ℓ + t_{ℓ+1})s
    end for
    Repeat the loop for ℓ = −1, . . . , ℓ− starting again with δ = δ(x)
    Replace x with x + ts
    Set k to k + 1
  end while

In Algorithms 3, 4, 5, and 6, we determine t at the beginning of each iteration while we update x + ts at the end. (The actual update is x_i + ts_i, as only the coordinate associated with the current search direction is changed.) This allows us to place t in a crossing point interval, i.e., t_ℓ < t < t_{ℓ+1}, and to determine the next δ(x) while collecting observations along the line.

In Algorithm 4, we build on the advantage of collecting as we cross boundaries, appending only if a bit of the current observation changes from a 1 to a 0; for if two rows e and e′ of a set covering matrix E have the property that e bitwise majorizes e′, then the set covering problem with the row e removed has the same solution; that is, that row is redundant. It is shown in (Caron and Traynor, 2007) that if this strategy (which we are calling "cleaning as we cross") is followed, the rows collected along one line are pairwise incomparable.

Algorithm 4
  while k ≤ M do
    Select an integer i uniformly from {1, . . . , n} and set s to be the i-th coordinate vector
    Select t uniformly from (−b − x_i, b − x_i)
    Determine the crossing points t_{ℓ−} < . . . < t_{ℓ+}
    Let δ = δ(x)
    for ℓ = 1, . . . , ℓ+ do
      Create δ̂ from δ by flipping δ_{jℓ}, the jℓ-th bit of δ
      If δ̂_{jℓ} = 0, δ̂ ≠ 0, and δ̂ is not a row of E, append δ̂ to E
      If δ̂_{jℓ} = 0 and δ is a row of E, remove δ from E
      Flip δ_{jℓ}
      If 1^T δ < 1^T δ(z), replace z with x + 0.5(t_ℓ + t_{ℓ+1})s
    end for
    Repeat the loop for ℓ = −1, . . . , ℓ− starting again with δ = δ(x)
    Replace x with x + ts
    Set k to k + 1
  end while

We demonstrate with figure 1. The current iterate is x, the direction is s, and the line segment is AB. The next iterate is chosen uniformly on the line segment AB. Starting from x and moving to the right, we cross the boundaries of constraints 2 and 32. In the region containing x we have, say, the observation δ_1 = (0,1,1,1,0,...,0,1). As we cross the boundary of constraint 2, the second bit of the observation changes from 1 to 0 and we get δ_2 = (0,0,1,1,0,...,0,1). Since δ_1 majorizes δ_2, we can remove δ_1. As we continue in that direction we cross the boundary of constraint 32 to get δ_3 = (0,0,1,1,0,...,0,0), which is majorized by δ_2, and δ_2 can be removed. Moving in the other direction from x we get, say, δ_4 = (0,0,1,1,0,...,1,0) and δ_5 = (0,0,1,1,1,...,1,0). Since δ_1 is majorized by δ_4, which is majorized by δ_5, they can both be removed. The only observation put in E is δ_3.

In Algorithm 5, we introduce a new strategy by changing the region in question. Instead of going to the ends of the box, we choose our new t uniformly in the interval where the constraints currently satisfied will remain satisfied. We demonstrate with figure 2, which shows that the next iterate is chosen uniformly on the line segment AB along which constraints 1, 5, . . . , 31 remain satisfied. In Algorithm 6 we use the method of Algorithm 5 but also clean as in Algorithm 4.
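For concreteness, the following is a hedged sketch (ours, with an assumed data layout) of two primitives that all six algorithms rely on: the constraint oracle that computes δ(x) by an eigenvalue test, and the final cleaning of E by bit-wise majorization.

```python
import numpy as np

def delta(x, A):
    """delta(x): bit j is 1 iff the j-th LMI F_j(x) >= 0 is violated.

    A[j] is the list [A0, A1, ..., An] of symmetric matrices of constraint j,
    so F_j(x) = A0 + sum_i x[i] * A[i+1] (our own layout, for illustration).
    """
    bits = []
    for mats in A:
        F = mats[0] + sum(xi * Ai for xi, Ai in zip(x, mats[1:]))
        bits.append(1 if np.linalg.eigvalsh(F)[0] < 0 else 0)  # min eigenvalue < 0
    return np.array(bits, dtype=np.uint8)

def clean(E):
    """Remove duplicate rows and rows that bitwise majorize another row of E."""
    rows = sorted({tuple(r) for r in E}, key=sum)   # unique rows, smallest sum first
    keep = []
    for r in rows:
        # r is redundant if an already kept row k has k <= r componentwise
        if not any(all(kb <= rb for kb, rb in zip(k, r)) for k in keep):
            keep.append(r)
    return np.array(keep, dtype=np.uint8)
```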

Figure 1: An iteration of Algorithm 4. The next iterate is uniform on the interval [A,B].

Figure 2: An iteration of Algorithm 5. The next iterate is uniform on the interval [A,B].


Algorithm 5
  while k ≤ M do
    Select an integer i uniformly from {1, . . . , n} and set s to be the i-th coordinate vector
    Determine the crossing points t_{ℓ−} < . . . < t_{ℓ+}
    From those crossing points find the step lengths t− < 0 < t+ that define the line segment along which the same constraints are satisfied as for x
    Select t uniformly from (t− − x_i, t+ − x_i)
    Let δ = δ(x)
    for ℓ = 1, . . . , ℓ+ do
      Create δ̂ from δ by flipping δ_{jℓ}, the jℓ-th bit of δ
      If δ̂_{jℓ} = 0, δ̂ ≠ 0, and δ̂ is not a row of E, append δ̂ to E
      If 1^T δ < 1^T δ(z), replace z with x + 0.5(t_ℓ + t_{ℓ+1})s
    end for
    Repeat the loop for ℓ = −1, . . . , ℓ− starting again with δ = δ(x)
    Replace x with x + ts
    Set k to k + 1
  end while

Algorithm 6
  while k ≤ M do
    Select an integer i uniformly from {1, . . . , n} and set s to be the i-th coordinate vector
    Determine the crossing points t_{ℓ−} < . . . < t_{ℓ+}
    From those crossing points find the step lengths t− < 0 < t+ that define the line segment along which the same constraints are satisfied as for x
    Select t uniformly from (t− − x_i, t+ − x_i)
    Let δ = δ(x)
    for ℓ = 1, . . . , ℓ+ do
      Create δ̂ from δ by flipping δ_{jℓ}, the jℓ-th bit of δ
      If δ̂_{jℓ} = 0, δ̂ ≠ 0, and δ̂ is not a row of E, append δ̂ to E
      If δ̂_{jℓ} = 0 and δ is a row of E, remove δ from E
      Flip δ_{jℓ}
      If 1^T δ < 1^T δ(z), replace z with x + 0.5(t_ℓ + t_{ℓ+1})s
    end for
    Repeat the loop for ℓ = −1, . . . , ℓ− starting again with δ = δ(x)
    Replace x with x + ts
    Set k to k + 1
  end while
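The step-length restriction used by Algorithms 5 and 6 can be sketched as follows; this is our own illustration, and it assumes that the PSD t-intervals of the pencils F^j(x) + tA_i^j (see section 4) are supplied as pairs (lo, hi).

```python
import numpy as np

def restricted_step(x, i, intervals, b, rng):
    """One Algorithm-5-style move along coordinate direction i (a sketch).

    intervals[j] = (lo, hi) is the PSD t-interval of F_j(x) + t*A_i^j;
    constraint j is currently satisfied iff lo <= 0 <= hi.  We choose t
    uniformly where every currently satisfied constraint stays satisfied,
    intersected with the sampling box [-b, b]^n.
    """
    t_minus, t_plus = -b - x[i], b - x[i]        # stay inside the box B
    for lo, hi in intervals:
        if lo <= 0.0 <= hi:                      # satisfied at t = 0, i.e., at x
            t_minus, t_plus = max(t_minus, lo), min(t_plus, hi)
    y = x.copy()
    y[i] += rng.uniform(t_minus, t_plus)
    return y
```

In use, y = restricted_step(x, i, intervals, b, np.random.default_rng()) plays the role of x + ts in the pseudocode above.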


Theorem 3. In all algorithms, if the feasible region has non-void interior, then, with probability 1, a feasible point will be found.

Proof. We prove the result only for Algorithm 5. We begin in the sampling box. With probability 1, using the CD algorithm, a point a_1 will be found interior to one or more of the regions X_j. Let C_1 be the intersection of those constraint regions containing a_1. The CD algorithm then starts again, remaining in C_1. Continuing recursively, once C_k has been found and a_k is a point in the interior of C_k, the CD algorithm is run, finding only points inside C_k. Since the CD algorithm is asymptotically uniform in C_k, if C_k is not the feasible region, then a new point a_{k+1} will be found interior to a new X_j, determining a new C_{k+1}. Eventually, all q constraints will be used, and the algorithm will reach a feasible point.

With respect to finding feasibility, Algorithms 5 and 6 should be more successful since at each stage C_{k+1} is a proper subset of C_k, so that the probability of finding a feasible point in one iteration in C_{k+1} is greater than for C_k. This is demonstrated in the numerical results in section 5.

4. The PSD Interval for a Matrix Pencil

To simplify the notation of (3), let V, W be symmetric matrices with W ≠ 0, and consider the pencil
$$P : t \mapsto V + tW. \tag{4}$$
The set of positive definite matrices is an open convex cone and the set of positive semidefinite matrices is a closed convex cone. It follows that the set T = {t ∈ R : P(t) ≻ 0} is an open interval, the positive definite interval (of P), and T̄ = {t ∈ R : P(t) ⪰ 0} is a closed interval (bounded or unbounded), the positive semidefinite interval. Let p(t) = det P(t). We refer to p as the determining polynomial, for its roots, in the cases of present interest, determine the intervals T and T̄.

Lemma 4. The interval T̄ is not all of R. The determining polynomial p has at least one root. If p is not identically 0, then T is the interior of T̄. (Notice, by the way, that this will hold whenever T is non-empty.)

Proof. Since W ≠ 0, we choose an eigenvector x of W with a non-zero eigenvalue. Then t ↦ x^T P(t)x is a straight line with a non-zero slope x^T W x, so it must be somewhere negative. Thus, T̄ ≠ R. This straight line must also be zero somewhere, so p has at least one root. The open interval int T̄ contains T. Suppose (int T̄) \ T ≠ ∅. Then, for some non-zero x, the straight line t ↦ x^T P(t)x is everywhere non-negative on T̄ and takes the minimum value 0 in the interior; hence it is constantly 0. Thus, P(t) is everywhere singular, that is, p is identically zero. Thus, if p is not identically zero, int T̄ ⊆ T, as required.

Re-examining the slope argument in the proof of Lemma 4, considering the possible eigenvalues of W, we deduce:

Lemma 5. If W ≻ 0 then T̄ is of the form [t̲, +∞), where t̲ is the largest root of p; if W ≺ 0, it is of the form (−∞, t̄], with t̄ the smallest root of p. If W is not negative semidefinite, then T̄ is bounded below, and if W is not positive semidefinite, then T̄ is bounded above.

In case one of V and W is non-singular, the determining polynomial p is not identically zero, so it has finitely many roots, which can be expressed in terms of eigenvalues. If V is non-singular, which it will almost surely be in our application, the roots are the negative inverses of the non-zero eigenvalues of V^{-1}W. Indeed,
$$p(t) = \det(V + tW) = 0 \tag{5}$$
holds if and only if det(−(1/t)I − V^{-1}W) = 0, that is, if and only if −1/t is an eigenvalue of V^{-1}W. Similarly, if W is non-singular, the roots of p are the negatives of the eigenvalues of V W^{-1}.

We will concentrate on the case where V is non-singular. As we have noted, the roots of the determining polynomial p are the negative inverses of the non-zero eigenvalues of V^{-1}W. If V is positive definite, then the eigenvalues of V^{-1}W are the same as those of the symmetric matrix V^{-1/2} W V^{-1/2}. By Sylvester's Law of Inertia, this matrix has the same inertia as W. In particular, it is positive definite, negative definite, positive semidefinite, or negative semidefinite if and only if W is likewise. Replacing V by −V, we see that if V is negative definite, the inertia reverses and becomes that of −W.

The case V is positive definite. When V is positive definite, 0 ∈ T and T̄ is the closure of T. Thus, t̄ is either the smallest positive root of p, or +∞ if there is none; t̲ is either the largest negative root of p, or −∞. Translating this into a statement about the eigenvalues of V^{-1}W, we obtain:

Theorem 6. Suppose that V is positive definite. Let λ_max be the largest eigenvalue of V^{-1}W and λ_min the smallest. Then the positive semidefinite interval for V + tW is given by
$$\bar{T} = \begin{cases} [-\lambda_{\max}^{-1}, +\infty) & \text{if } W \succeq 0, \text{ equivalently if } \lambda_{\min} \ge 0,\\ (-\infty, -\lambda_{\min}^{-1}] & \text{if } W \preceq 0, \text{ equivalently if } \lambda_{\max} \le 0,\\ [-\lambda_{\max}^{-1}, -\lambda_{\min}^{-1}] & \text{otherwise.} \end{cases}$$
(Recall that the case W = 0 has been excluded from the outset.)
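In code, Theorem 6 might be implemented as in the following sketch (ours; it assumes numpy and symmetric inputs), using the fact noted above that, for V positive definite, the eigenvalues of V^{-1}W are those of the symmetric matrix L^{-1}WL^{-T}, where V = LL^T is a Cholesky factorization.

```python
import numpy as np

def psd_interval_posdef(V, W, tol=1e-12):
    """PSD interval of t -> V + tW for symmetric V > 0 and W != 0 (Theorem 6).

    Returns (lo, hi); lo may be -inf and hi may be +inf.
    """
    L = np.linalg.cholesky(V)
    # L^{-1} W L^{-T} is symmetric with the same eigenvalues as V^{-1} W
    M = np.linalg.solve(L, np.linalg.solve(L, W).T)
    lam = np.linalg.eigvalsh(M)                         # ascending order
    lam_min, lam_max = lam[0], lam[-1]
    lo = -1.0 / lam_max if lam_max > tol else -np.inf   # W has a positive eigenvalue
    hi = -1.0 / lam_min if lam_min < -tol else np.inf   # W has a negative eigenvalue
    return lo, hi
```

For example, with V = I and W symmetric indefinite, this returns [−1/λ_max(W), −1/λ_min(W)], matching the third case of the theorem.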

The case V is negative definite. From Lemma 5, we know the form of T̄ when W is positive definite or negative definite. Taking into account the reversal of the inertia mentioned above, we find, when V is negative definite, that the formulas turn out the same as in Theorem 6. If W is neither positive nor negative definite, we find that T̄ is empty.

Theorem 7. Suppose V is negative definite. Let λ_max be the largest eigenvalue of V^{-1}W and λ_min the smallest. Then the positive semidefinite interval T̄ is given as follows:
$$\bar{T} = \begin{cases} [-\lambda_{\max}^{-1}, +\infty) & \text{if } W \succ 0, \text{ equivalently, if } \lambda_{\max} < 0,\\ (-\infty, -\lambda_{\min}^{-1}] & \text{if } W \prec 0, \text{ equivalently, if } \lambda_{\min} > 0,\\ \emptyset & \text{otherwise.} \end{cases}$$

Proof. If W ≻ 0, then by Lemma 5, T̄ = [t̲, +∞), where t̲ is the largest root of p. Since V = P(0) is negative definite, P(t) is negative definite for all t ≤ 0; hence all the roots of p are positive or, equivalently, all the eigenvalues of V^{-1}W are negative (as we have already noted). The largest root of p corresponds to the largest eigenvalue of V^{-1}W, so T̄ = [−λ_max^{-1}, +∞) with λ_max < 0. Similarly, W ≺ 0 corresponds to all the roots of p being negative (that is, all the eigenvalues of V^{-1}W positive), and T̄ = (−∞, −λ_min^{-1}], with λ_min > 0. In the remaining case, where W is neither positive definite nor negative definite, T̄ is empty. Indeed, if W has a zero eigenvalue with eigenvector x, then x^T P(t)x = x^T V x < 0 for all t, while if it has a positive eigenvalue with eigenvector u and a negative eigenvalue with eigenvector v, then, for t ≤ 0, u^T P(t)u < 0 and, for t > 0, v^T P(t)v < 0, so again there is no t for which P(t) is positive semidefinite.

The case V non-singular and indefinite. In this case, we can repeat the above arguments for the unbounded cases, W ≺ 0 or W ≻ 0, but we cannot guarantee that the eigenvalues of V^{-1}W are on one side of 0. Instead, we have to ensure that we work with eigenvalues of the correct sign.

Theorem 8. Suppose V is non-singular and indefinite and W ≻ 0 or W ≺ 0, and let λ_max− be the largest negative eigenvalue of V^{-1}W and λ_min+ the smallest positive one. Then the positive semidefinite interval T̄ is given by
$$\bar{T} = \begin{cases} [-\lambda_{\max-}^{-1}, +\infty) & \text{if } W \succ 0,\\ (-\infty, -\lambda_{\min+}^{-1}] & \text{if } W \prec 0. \end{cases}$$

The case W indefinite, however, is not covered by this. In this case, one can use a "brute force" method: simply partition the real line by the roots of p (determined by the eigenvalues of V^{-1}W), choose a point in each such interval, and check for positive definiteness. The search can be narrowed down by the following analysis. Let Q be an orthogonal n × n matrix such that
$$Q^\top V Q = \begin{pmatrix} D_1 & 0 \\ 0 & -D_2 \end{pmatrix}, \tag{6}$$
where D_1, D_2 are diagonal and positive definite. Let
$$Q^\top W Q = \begin{pmatrix} E_1 & E_3 \\ E_3^\top & E_2 \end{pmatrix},$$
where E_1 and E_2 have the same sizes as D_1 and D_2. Then
$$Q^\top (V + tW) Q = \begin{pmatrix} D_1 + tE_1 & tE_3 \\ tE_3^\top & -D_2 + tE_2 \end{pmatrix}.$$

By Theorem 6 and Theorem 7, we can determine the positive semidefinite intervals for D_1 + tE_1 and −D_2 + tE_2. We denote them by T̄_1 and T̄_2 and obtain:

Theorem 9. T̄ ⊆ T̄_1 ∩ T̄_2 and, if E_3 = 0, T̄ = T̄_1 ∩ T̄_2.

Thus, if E_3 is not a zero matrix, we need only check the roots inside T̄_1 ∩ T̄_2, while if it is 0, we can obtain T̄ immediately.

Table 1: PSD interval for V non-singular, W ≠ 0, using eigenvalues of V^{-1}W

                W ⪰ 0                              W ⪯ 0                              W indefinite
V ≻ 0           [−1/λmax, +∞)                      (−∞, −1/λmin]                      [−1/λmax, −1/λmin]
V ≺ 0           [−1/λmax, +∞) (∅ if W singular)    (−∞, −1/λmin] (∅ if W singular)    ∅
V indefinite    [−1/λmax−, +∞), if W ≻ 0           (−∞, −1/λmin+], if W ≺ 0           not covered

In the cases not covered (V indefinite and W indefinite or singular), all we can say is T̄ ⊆ T̄_1 ∩ T̄_2.
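A hedged sketch of the brute-force method for the remaining cases (our own illustration, assuming V non-singular and symmetric inputs): partition the line at the real roots of p and test one probe point per cell.

```python
import numpy as np

def psd_interval_brute(V, W, tol=1e-10):
    """Brute-force PSD interval of t -> V + tW for V non-singular.

    The line is partitioned at the roots of p(t) = det(V + tW), i.e. at
    -1/lambda for the non-zero real eigenvalues lambda of V^{-1}W, and one
    probe per cell is tested.  Returns (lo, hi), possibly infinite, or None.
    """
    lam = np.linalg.eigvals(np.linalg.solve(V, W))
    real = lam[np.abs(lam.imag) < tol].real
    roots = np.sort(np.array([-1.0 / l for l in real if abs(l) > tol]))

    def is_psd(t):
        F = V + t * W
        return np.linalg.eigvalsh((F + F.T) / 2.0)[0] >= -tol

    if roots.size == 0:                      # no real root: PSD everywhere or nowhere
        return (-np.inf, np.inf) if is_psd(0.0) else None
    probes = np.concatenate(([roots[0] - 1.0],
                             (roots[:-1] + roots[1:]) / 2.0,
                             [roots[-1] + 1.0]))
    good = [i for i, t in enumerate(probes) if is_psd(t)]
    if not good:
        return None                          # PSD nowhere (up to isolated points)
    lo = -np.inf if good[0] == 0 else roots[good[0] - 1]
    hi = np.inf if good[-1] == len(probes) - 1 else roots[good[-1]]
    return lo, hi
```

By Theorem 9, when E_3 ≠ 0 the probes can be limited to T̄_1 ∩ T̄_2, which shortens the search.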

5. Examples and Numerical Results

We tested Matlab 7.1 implementations of the six algorithms on fourteen examples, using a Dell Inspiron with a Pentium processor (1.70 GHz, 504 MB RAM). We use AlgI to refer to Algorithm I. The examples solved are described in table 2. The columns give the example number, the dimension n of the space, the number of constraints q, the value of b that defines the sampling box B, the state of feasibility, and the source.

The first ten examples were randomly generated. Each matrix A_0^(j) is a diagonal matrix with non-zero entries sampled from U(0,1) using the Matlab routine "rand". All other matrices A_i^(j) were generated with the Matlab routine "sprandsym(n,density)", with the density parameter set to 0.8. The exceptions are examples eight and ten where, for each, the first (q − 1) constraints are generated as above and the q-th constraint is obtained from the (q − 1)-st by setting A_0^(q) = −A_0^(q−1) − I(m(q−1)) and A_i^(q) = −A_i^(q−1) for i ≠ 0. Consequently, examples eight and ten are infeasible, with the last constraint in any IIS.

The other eight random problems have the origin as a feasible point. The last four examples are from the SDPLIB (Borchers, 1999) and they are feasible.

Table 2: Description of the Examples

Ex.   variables n   constraints q   box radius b   feasible or infeasible   Source
1     2             5               100            feasible                 random
2     2             4               100            feasible                 random
3     5             20              1              feasible                 random
4     2             7               10             feasible                 random
5     6             20              10             feasible                 random
6     5             100             1              feasible                 random
7     7             30              1              feasible                 random
8     2             4               10             infeasible               random *
9     25            100             10             feasible                 random
10    25            101             10             infeasible               random *
11    6             7               10             feasible                 truss1 [Borchers (1999)]
12    172           151             0.5            feasible                 truss6 [Borchers (1999)]
13    13            3               10             feasible                 hinf1 [Borchers (1999)]
14    13            3               10,000         feasible                 hinf2 [Borchers (1999)]

The results are given in tables 3, 4, 5, 6, and 7; they were generated from M = 1000 iterations. In tables 6 and 7 the results for Alg3 and Alg4 are grouped, as are the results for Alg5 and Alg6, since the iterates are identical for those algorithm pairs. Although the pseudocode does not show it explicitly, the implementations of Alg3, Alg4, Alg5, and Alg6 switch to the CD algorithm when a feasible point is found. (In fact, this produces the same results as given by the pseudocode, but a direct CD implementation eliminates unnecessary calculations.) This strategy fits within our SC framework with the understanding that detecting a necessary constraint using CD (or any hit and run method) is equivalent to the detection of a singleton row in the SC matrix E, where the single "1" corresponds to the detected constraint. In that sense, any of the hit and run algorithms can be considered a very effective sampling technique for this SC framework once a feasible point is known.

The time in seconds for each algorithm to complete one thousand iterations is in table 3. This includes the time to collect the samples, to clean the SC matrix, and to solve the SC problem. In table 4 we see the time for the cleaning and, in brackets, the percentage of total time used for cleaning. In many cases, cleaning takes the bulk (almost all) of the time. The effects of cleaning are given in table 5, where we give the number of SC rows before and after cleaning. We expect Alg2 to take less time than Alg1, as it updates function values taking advantage of the structure of the CD iteration. This is the case in all but the twelfth example. Likewise, we expect Alg4 and Alg6 to take less time than Alg3 and Alg5, respectively, as the "clean as you cross" paradigm is adopted. In table 3 this is the case except for Ex. 4 and Ex. 8, where Alg5 is faster than Alg6.

Table 3: Time to Complete 1,000 iterations (seconds)

Ex.   Alg1     Alg2    Alg3      Alg4      Alg5     Alg6
1     0.093    0.028   0.038     0.028     0.004    0.004
2     0.073    0.028   0.011     0.003     0.019    0.004
3     0.189    0.146   0.347     0.241     0.151    0.041
4     0.066    0.040   0.038     0.004     0.005    0.029
5     0.220    0.214   2.200     1.133     22.277   1.289
6     3.604    2.869   138.977   51.055    11.764   0.769
7     0.094    0.006   0.086     0.009     0.847    0.077
8     0.086    0.027   0.027     0.026     0.003    0.012
9     8.639    5.482   331.587   87.841    13.057   1.239
10    12.093   7.569   544.406   140.101   92.387   5.316
11    0.446    0.018   0.013     0.012     0.015    0.005
12    0.297    0.370   0.897     0.093     6.375    0.489
13    0.064    0.049   0.022     0.002     0.011    0.002
14    0.058    0.026   0.008     0.002     0.023    0.007

Table 4: Cleaning the SC Matrix - Absolute Time in seconds (% of total time)

Ex.   Alg1          Alg2         Alg3           Alg4           Alg5          Alg6
1     0.065 (70)    0.002 (7)    0.012 (32)     0.012 (11)     0.003 (75)    0.003 (75)
2     0.066 (90)    0.003 (11)   0.002 (18)     0.002 (68)     0.002 (11)    0.003 (75)
3     0.173 (92)    0.120 (82)   0.346 (99)     0.240 (99)     0.124 (82)    0.015 (37)
4     0.058 (88)    0.002 (5)    0.012 (32)     0.003 (75)     0.004 (80)    0.004 (14)
5     0.213 (97)    0.186 (87)   2.168 (99)     1.131 (99)     22.274 (99)   1.269 (98)
6     3.583 (99)    2.860 (99)   138.965 (99)   51.043 (99)    11.757 (99)   0.762 (99)
7     0.057 (61)    0.005 (83)   0.085 (99)     0.007 (78)     0.831 (98)    0.074 (96)
8     0.054 (63)    0.002 (7)    0.025 (97)     0.002 (7)      0.002 (67)    0.002 (17)
9     8.599 (99)    5.467 (99)   331.518 (99)   87.782 (99)    13.054 (99)   1.232 (99)
10    12.021 (99)   7.559 (99)   544.372 (99)   140.051 (99)   92.293 (99)   5.315 (99)
11    0.418 (94)    0.008 (44)   0.012 (92)     0.011 (92)     0.009 (60)    0.004 (80)
12    0.263 (89)    0.338 (91)   0.880 (98)     0.091 (98)     6.373 (99)    0.485 (99)
13    0.043 (67)    0.039 (80)   0.009 (41)     0.001 (50)     0.010 (91)    0.002 (99)
14    0.052 (90)    0.012 (46)   0.003 (38)     0.001 (50)     0.017 (74)    0.002 (29)

As CD iterates converge to a uniform distribution, and as convergence to uniformity depends on the dimension, we expect, only for smaller values of n, the same overall performance from Alg1 and Alg2 in terms of the number of SC rows, the number of necessary constraints, and feasibility detection. This is validated in tables 5, 6, and 7. For larger n we expect Alg1 to produce more SC rows than Alg2, as its sampled points are truly uniform. This is the case for problems with more than twenty-five variables (table 5). In table 5 we also see the benefit of the "clean as you cross" strategy in that Alg4 and Alg6 have fewer SC rows before cleaning than Alg3 and Alg5, respectively. In many examples, e.g., Ex. 9, the decrease is dramatic. Of course, both algorithm pairs produce the same SC rows after cleaning, so, as expected, table 5 shows that Alg3 and Alg5 have the same number of SC rows after cleaning as Alg4 and Alg6, respectively. We expect Alg3 to produce more SC rows before cleaning than Alg2, as Alg3 collects observations along the line segments. The evidence is in table 5. Even after cleaning, in most cases there are more SC rows, an additional benefit.

Table 5: Cleaning the SC Matrix - Reduction in the Number of Rows

Ex.   Alg1          Alg2         Alg3            Alg4           Alg5          Alg6
1     6 → 3         7 → 3        12 → 1          11 → 1         11 → 2        10 → 2
2     8 → 3         8 → 3        9 → 3           7 → 3          9 → 3         5 → 3
3     264 → 9       238 → 21     615 → 33        432 → 33       309 → 4       26 → 4
4     5 → 2         4 → 2        7 → 2           6 → 2          17 → 2        12 → 2
5     64 → 35       77 → 38      589 → 139       325 → 139      10261 → 17    620 → 17
6     711 → 269     634 → 232    9523 → 1083     3518 → 1083    5582 → 11     376 → 11
7     3 → 2         3 → 2        6 → 3           5 → 3          1341 → 15     117 → 15
8     3 → 2         3 → 2        6 → 2           6 → 2          6 → 2         6 → 2
9     1000 → 998    878 → 643    15184 → 3010    4040 → 3010    5997 → 36     586 → 36
10    1000 → 958    891 → 634    15650 → 3013    4080 → 3013    36163 → 16    2444 → 16
11    41 → 4        40 → 3       64 → 5          60 → 5         45 → 6        17 → 6
12    68 → 53       94 → 30      277 → 9         17 → 9         1878 → 62     143 → 62
13    3 → 2         4 → 1        5 → 2           5 → 2          7 → 3         7 → 3
14    2 → 1         2 → 1        4 → 3           4 → 3          5 → 3         5 → 3

In Alg5 and Alg6 we restrict our random search to regions with a monotonically increasing number of satisfied constraints. We therefore expect better observations, i.e., rows with fewer violated constraints, compared to those generated by Alg1, Alg2, Alg3, and Alg4. In fact, Alg5 and Alg6 produce a feasible point for all feasible examples (see table 7), and they classify more of the necessary constraints (table 6). This validates Theorem 3. However, the choice of the size b of the sampling box is important for the success of the algorithms, and the full development of this methodology will need to make use of a dynamic scheme for adjusting b. Techniques such as those in Chinneck (2004) could be ideal.


Table 6: Number of Necessary Constraints Found

Ex.   Alg1   Alg2   Alg3 & Alg4   Alg5 & Alg6
1     1      1      1             2
2     3      3      3             3
3     1      1      1             4
4     1      1      2             2
5     1      1      1             17
6     1      1      1             11
7     1      1      1             15
8     1      1      2             2
9     4      3      3             36
10    1      1      1             1
11    3      3      5             6
12    3      2      1             62
13    1      1      2             3
14    1      1      2             3

Table 7: Feasibility (Iteration Number)

Ex.   Alg1   Alg2   Alg3 & Alg4   Alg5 & Alg6
1     No     No     No            Yes (119)
2     No     No     Yes (61)      Yes (81)
3     No     No     No            Yes (48)
4     No     No     Yes (61)      Yes (119)
5     No     No     No            Yes (54)
7     No     No     No            Yes (792)
8     No     No     No            No
9     No     No     No            Yes (496)
10    No     No     No            No
11    No     No     No            Yes (64)
12    No     No     No            Yes (905)
13    No     No     No            Yes (327)
14    No     No     No            Yes (285)

Figures 3 to 5 concern Ex. 1. In figure 3 we see the boundaries of the feasible region. It shows two necessary constraints, correctly determined only by Alg5 and Alg6. The feasible region is not visible in the other figures, where the scale differs by a factor of one hundred.

Figure 3: Feasible region for Ex. 1.

In figure 4 we see the points generated by Alg4. The constraint boundaries are marked by the plotted crossing points, and the "+" symbols mark the iteration points. In figure 5 we see the points generated by Alg6. A feasible point was found after 119 iterations, after which the CD algorithm was used, and further iteration points were in the feasible region, which is not visible. As the boundaries were also drawn by crossing points, they are more porous than in figure 4. As well, the iteration points are concentrated on the right side of the box (rather than being uniform), as iterates were chosen to move towards feasibility.


Figure 4: Algorithm 4 applied to example 1.

Figure 5: Algorithm 6 applied to example 1.


6. Conclusions

We have shown the benefit of the SC approach for the analysis of LMI constraints, and we have shown its promise as a probabilistic method for constraint analysis and for finding feasibility or approximate feasibility. Given the generality of the method, and its use of hit and run methods whose power is only starting to be well understood (see, for example, (Andersen and Diaconis, 2007) and (Lovasz, 1999)), it continues to show promise for constraint analysis and as a feasibility algorithm for very general sets of constraints. As in (Chinneck, 2004), where methods are presented to find approximately feasible points, there is a need for reasonable bounds on the variables to improve sampling, and a need to modify the sampling methods to accommodate problem sets with both linear and nonlinear equality constraints. Also, we have yet to test the effectiveness of the approximately feasible point returned by our algorithms. These topics are for the future.

Acknowledgment This work has been supported by NSERC, the Natural Sciences and Engineering Research Council of Canada through a discovery grant to the first author.

References

Alizadeh, F. 1995. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM J. Optimization 5 13–51.

Andersen, E., K. Andersen. 1995. Presolving in linear programming. Mathematical Programming 71 221–245.

Andersen, H.C., P. Diaconis. 2007. Hit and run as a unifying device. http://stat.stanford.edu/~cgates/PERSI/papers/hitandrun062207.pdf.

Bélisle, C.J., A. Boneh, R.J. Caron. 1998. Convergence properties of hit-and-run samplers. Comm. Statist. Stochastic Models 14 767–800.

Bélisle, C.J., H.E. Romeijn, R.L. Smith. 1993. Hit-and-run algorithms for generating multivariate distributions. Mathematics of Operations Research 18 255–266.

Berbee, H.C.P., C.G.E. Boender, A.H.G. Rinnooy Kan, H.E. Romeijn, C.L. Scheffer, R.L. Smith, J. Telgen. 1987. Hit-and-run algorithms for the identification of nonredundant linear inequalities. Mathematical Programming 37 184–207.


Boneh, A. 1983. Preduce - a probabilistic algorithm for identifying redundancy by a random feasible point generator (RFPG). M.H. Karwan, V. Lotfi, J. Telgen, S. Zionts, eds., Redundancy in Mathematical Programming. Springer-Verlag, Berlin, 108–134.

Boneh, A. 1984. Identification of redundancy by a set-covering equivalence. J.P. Brans, ed., Operational Research '84. Elsevier Science Publishers B.V. (North Holland), Amsterdam, 407–422.

Boneh, A., A. Golan. 1979. Constraints redundancy and feasible region boundedness by random feasible point generator (RFPG). Presented at the EURO III Conference, Amsterdam, April 9-11 (22pp).

Borchers, B. 1999. SDPLIB 1.2, a library of semidefinite programming test problems. Optimization Methods and Software 683–690.

Caron, R.J., T. Traynor. 2007. A general framework for the analysis of sets of constraints. S.K. Neogy, R.B. Bapat, A.K. Das, T. Parthasarathy, eds., Mathematical Programming and Game Theory for Decision Making. World Scientific, Singapore, 33–45.

Chakravarti, N. 1994. Some results concerning post-infeasibility analysis. European Journal of Operational Research 73 139–143.

Chinneck, J.W. 2004. The constraint consensus method for finding approximately feasible points in nonlinear programs. INFORMS J. Comput. 16 255–265.

Chinneck, J.W., E.W. Dravnieks. 1991. Locating minimal infeasible constraint sets in linear programs. ORSA Journal on Computing 3 157–167.

Chvatal, V. 1979. A greedy heuristic for the set-covering problem. Mathematics of Operations Research 4 233–235.

Jibrin, S. 1998. Redundancy in semidefinite programming. Ph.D. thesis, Carleton University, Ottawa.

Lovasz, L. 1999. Hit and run mixes fast. Mathematical Programming 86 443–461.

Smith, R.L. 1984. Efficient Monte Carlo procedures for generating points uniformly distributed over bounded regions. Operations Research 32 1296–1308.

Telgen, J. 1979. The CD algorithm. Private communication with Arnon Boneh.

Traynor, T., R.J. Caron. 2005. The zero set of a polynomial. Technical Report WMSR 05-02, University of Windsor.

Vandenberghe, L., S. Boyd. 1996. Semidefinite programming. SIAM Review 38 49–95.

Wolkowicz, H. 2001. Simple efficient solutions for semidefinite programming. Technical Report CORR 2001-49, University of Waterloo, Waterloo, Ontario, Canada.
