Solvability of systems of linear interval equations
A Dissertation Submitted For The Award of the Degree of Master of Philosophy In Mathematics
Leena Soni, School of Mathematics, Devi Ahilya Vishwavidyalaya (NAAC Accredited Grade "A"), Indore (M.P.), 2013 – 2014
Contents

Introduction
Chapter 1  Approximate solution of linear equations with given error bounds
Chapter 2  Solvability of systems of linear interval equations
Chapter 3  An existence theorem for systems of linear equations
References
Introduction

This dissertation is a reading of the research papers, listed in the references, on the solvability of systems of linear interval equations and inequalities.

Let A = [\underline{A}, \overline{A}] = {A | \underline{A} ≤ A ≤ \overline{A}} be an m × n interval matrix and b = [\underline{b}, \overline{b}] = {b | \underline{b} ≤ b ≤ \overline{b}} an m-dimensional interval vector. Inequalities are taken componentwise, and it is assumed that \underline{A} ≤ \overline{A} and \underline{b} ≤ \overline{b}, so that both sets are nonempty. A system of linear interval equations, formally written as

    Ax = b,                                                  (1)

is the family of all systems of linear equations

    Ax = b,                                                  (2)

with

    A ∈ A, b ∈ b.                                            (3)

Here we study the question of solvability of general systems of linear interval equations. The system (1) is called solvable if each system in the family (2), (3) is solvable (i.e., has a solution).

Chapter 1 describes an approximate solution of linear equations with given error bounds. Conditions are established under which a given approximate solution of a system of n linear equations in n unknowns is the exact solution of a modified system whose coefficients and right-hand sides lie within a given neighbourhood of those of the original system.

Chapter 2 is about the solvability of systems of linear interval equations. The main result (Theorem 2.4) characterises the solvability of Ax = b in terms of the nonnegative solvability of a finite number of linear systems. We also prove a result on the nonnegative solvability of the system Ax = b; further, uniqueness of the solution is obtained under an orthogonality condition, and some results on linear interval inequalities are proved.

Chapter 3 gives a constructive way of finding the solution whose existence is asserted in Theorem 2.4.

We are thankful to Prof. Jiri Rohn, Institute of Computer Science, Czech Academy of Sciences, for his prompt responses.
Chapter 1 Approximate solution of linear equations with given error bounds
1.1 Introduction
Consider a system of n simultaneous linear equations

    Ax = b,                                                  (1.1)

where the elements of the coefficient matrix A as well as those of the column b of right-hand sides are known only with limited accuracy. In this chapter, conditions are established under which a given approximate solution of a system of n linear equations in n unknowns is the exact solution of a modified system whose coefficients and right-hand sides lie within a given neighbourhood of those of the original system.

Consider first the special case in which the coefficients A_ij in (1.1) are precisely known, whereas the typical right-hand side may have any value from b_i − ∆b_i to b_i + ∆b_i, i = 1, 2, ..., n, where the tolerances ∆b_i are nonnegative. On account of rounding errors, the computed solution x^0 of the system (1.1) will not exactly satisfy these equations; that is, the residuals

    r = Ax^0 − b                                             (1.2)

will not all vanish. In view of the lack of precision of the right-hand sides, one would nevertheless accept x^0 as a solution of the system (1.1) if

    |r_i| ≤ ∆b_i,   i = 1, 2, ..., n.                        (1.3)

According to (1.3), the right-hand sides defined by

    b_i + δb_i = b_i + r_i                                   (1.4)

lie within the given tolerance limits of the original right-hand sides b_i, and by making this admissible modification of the right-hand sides we obtain a system of equations that has the exact solution x^0.

Given the system (1.1), a nonnegative tolerance matrix ∆A, a nonnegative tolerance column ∆b, and an approximate solution x^0 of (1.1), can we construct a modified system

    (A + δA)x = b + δb                                       (1.5)

with

    |δA| ≤ ∆A,  |δb| ≤ ∆b,  [∆A ≥ 0, ∆b ≥ 0],                (1.6)

that has x^0 as an exact solution?
1.2 Arbitrary Tolerances

We introduce the column

    ξ = (x^0_1, ..., x^0_n, 1)^T,                            (1.7)

which has n + 1 elements, and the matrices

    δC = (δA | −δb),   ∆C = (∆A | ∆b),                       (1.8)

which are n by (n + 1). Since x^0 is to be a solution of the modified system (1.5), we have

    δCξ = −r,                                                (1.9)

where r is defined by (1.2); indeed,

    δCξ = (δA | −δb)(x^0_1, ..., x^0_n, 1)^T = δAx^0 − δb = b − Ax^0 = −r,

the middle equality following from (1.5).
Now we wish to decide whether there exists a matrix δC satisfying equation (1.9) and the inequality

    |δC| ≤ ∆C.                                               (1.10)

Observe that this question may be discussed separately for each row of δC. In discussing the ith row we may assume that r_i ≠ 0; otherwise the ith equation in (1.1) is exactly satisfied, and we may take δC_ij = 0 for j = 1, 2, ..., n + 1.

So, if r_i ≠ 0, the ith row of δC must satisfy the relations

    Σ_{j=1}^{n+1} δC_ij (−ξ_j / r_i) = 1                     (1.11)

and

    |δC_ij| ≤ ∆C_ij   (j = 1, 2, ..., n + 1),                (1.12)

which do not involve elements from other rows of δC. It will be convenient to interpret the n + 1 elements δC_ij of the ith row of δC as rectangular cartesian coordinates in an (n + 1)-dimensional space. Equation (1.11) then represents a hyperplane that is normal to the vector with components

    v_j = −ξ_j / r_i   (j = 1, 2, ..., n + 1)                (1.13)

and lies on the side of the origin towards which this vector points. The points with coordinates δC_ij satisfying (1.12) lie inside or on a rectangular parallelepiped that is centered at the origin and has edges of length 2∆C_ij parallel to the jth coordinate axis.

To decide whether the ith equation of the system (1.1) can be modified in the desired manner, we first suppose that none of the values ξ_j vanishes, and determine the vertex of this parallelepiped that lies in the orthant containing the vector (1.13). This vertex has the coordinates

    ∆C_ij sgn(−ξ_j / r_i)   (j = 1, 2, ..., n + 1).          (1.14)

Through it we pass a supporting plane of the parallelepiped that is normal to the vector (1.13). The equation of this plane has the same left-hand side as equation (1.11) and the right-hand side

    λ_i = Σ_{j=1}^{n+1} ∆C_ij |ξ_j / r_i|,                   (1.15)

since (sgn x)x = |x|; that is, the supporting plane is

    Σ_{j=1}^{n+1} δC_ij (−ξ_j / r_i) = λ_i.

Observe that if λ_i < 1, then the hyperplane (1.11) does not meet the parallelepiped (1.12), and the ith equation has no admissible modification. If λ_i ≥ 1, then

    δC_ij = (∆C_ij / λ_i) sgn(−ξ_j / r_i) = (∆C_ij / λ_i) sgn(−ξ_j r_i)   (1.16)

satisfies

    Σ_{j=1}^{n+1} (∆C_ij / λ_i) sgn(−ξ_j / r_i)(−ξ_j / r_i) = (1/λ_i) Σ_{j=1}^{n+1} ∆C_ij |ξ_j / r_i| = 1,

i.e., δC_ij satisfies (1.11) and (1.12). This means the ith equation can be modified in the desired manner. If λ_i = 1, the δC_ij are unique; if λ_i > 1, then |δC_ij| < ∆C_ij, and (1.16) represents only one of infinitely many possible modifications of the ith row. Note that the last form of (1.16) also applies to the case r_i = 0, giving δC_ij = 0. The preceding discussion remains valid when one or several of the values ξ_j vanish, except that the supporting plane then contains an entire edge or face of the parallelepiped.

The quantity

    λ = min_i (λ_i − 1)                                      (1.17)

provides a measure of the slack left by the given tolerances. If λ is negative, then there is at least one i for which λ_i < 1, and the problem has no solution. If λ = 0, the tolerances admit a single solution for at least one row; if λ > 0, there exist infinitely many solutions for each row.

In view of (1.7) and (1.8), equation (1.15) can be written as
    λ_i = Σ_{j=1}^{n} ∆A_ij |x^0_j / r_i| + ∆b_i |1 / r_i|
        = (1/|r_i|) [Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i],      (1.18)

since ξ_j = x^0_j for j ≤ n and ξ_{n+1} = 1. Next, (1.16) gives

    δA_ij = (∆A_ij / λ_i) sgn(−x^0_j / r_i),   −δb_i = (∆b_i / λ_i) sgn(−1 / r_i),

which, using (1.18), yields

    δA_ij = − r_i ∆A_ij sgn(x^0_j) / (Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i),   (1.19)

    δb_i = r_i ∆b_i / (Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i).                  (1.20)

If λ_i > 1, then |δC_ij| < ∆C_ij; and by (1.18), the condition λ_i ≥ 1 holds if and only if

    |r_i| ≤ Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i.

An important special case is that in which all nonzero tolerances have a common value:

    ∆A_ij = c, ∆b_i = c whenever ∆A_ij > 0, ∆b_i > 0; note that c > 0, and hence sgn ∆A_ij = sgn ∆b_i = 1 for the nonzero tolerances.   (1.21)

For example, if the matrix A is tridiagonal, we may set ∆A_ij = c for all coefficients A_ij on the principal diagonal and immediately above and below it, but ∆A_ij = 0 for all other coefficients A_ij. When c is given, formula (1.18) becomes

    λ_i = (1/|r_i|) [Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i] = (c/|r_i|) [Σ_{j=1}^{n} |x^0_j| + 1]

if all tolerances equal c; in general, in view of (1.21), we can write

    λ_i = (c/|r_i|) [Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i].   (1.22)
Next, by (1.19),

    δA_ij = − r_i ∆A_ij sgn(x^0_j) / (Σ_{j=1}^{n} ∆A_ij |x^0_j| + ∆b_i)
          = − r_i sgn(x^0_j) / (Σ_{j=1}^{n} |x^0_j| + 1)     [if all ∆A_ij = ∆b_i = c],

and in general, in view of (1.21), we can write

    δA_ij = − r_i sgn(x^0_j ∆A_ij) / (Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i),   (1.23)

where it is easy to see that sgn(x^0_j ∆A_ij) = sgn(x^0_j) sgn(∆A_ij). Similarly, by (1.20),

    δb_i = r_i sgn ∆b_i / (Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i).              (1.24)

As a check, using (1.23) and (1.24),

    Σ_{j=1}^{n} δA_ij x^0_j − δb_i
        = − r_i [Σ_{j=1}^{n} sgn(x^0_j) sgn(∆A_ij) x^0_j + sgn ∆b_i] / (Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i)
        = − r_i,

in agreement with (1.9). Finally, the condition λ_i ≥ 1, with λ_i given by (1.22), reads

    c ≥ |r_i| / (Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i)   for each i,

so the minimum admissible value of c is

    min c = max_i |r_i| / (Σ_{j=1}^{n} |x^0_j| sgn ∆A_ij + sgn ∆b_i).   (1.25)
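To make the construction concrete, here is a small numerical sketch (Python with NumPy; the data are hypothetical, chosen only for illustration). It computes the residuals (1.2), the quantities λ_i of (1.18), and the modification δA, δb of (1.19), (1.20), and verifies that x^0 solves the modified system (1.5) exactly while staying within the tolerances (1.6).

```python
import numpy as np

# Hypothetical data: a 2x2 system Ax = b, a rounded approximate solution x0,
# and nonnegative tolerance matrices DA (= Delta A) and Db (= Delta b).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([0.07, 0.64])
DA = np.full((2, 2), 0.05)
Db = np.full(2, 0.05)

r = A @ x0 - b                          # residuals (1.2)

# lambda_i of (1.18); lambda_i >= 1 for every i is the solvability condition
denom = DA @ np.abs(x0) + Db            # per-row sum: sum_j DA_ij |x0_j| + Db_i
lam = denom / np.abs(r)
assert np.all(lam >= 1), "tolerances too tight: no admissible modification"

# delta A and delta b from (1.19) and (1.20)
dA = -(r[:, None] * DA * np.sign(x0)[None, :]) / denom[:, None]
db = (r * Db) / denom

# x0 is an exact solution of the modified system (1.5),
# and the modification stays within the tolerances (1.6)
assert np.allclose((A + dA) @ x0, b + db)
assert np.all(np.abs(dA) <= DA + 1e-12) and np.all(np.abs(db) <= Db + 1e-12)
```

Note that, as the derivation shows, the identity (A + δA)x^0 = b + δb holds for any residual vector; it is only the bound |δA| ≤ ∆A, |δb| ≤ ∆b that requires λ_i ≥ 1.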
Chapter 2 Solvability of systems of linear interval equations
2.1 Introduction
Let A = [\underline{A}, \overline{A}] = {A | \underline{A} ≤ A ≤ \overline{A}} be an m × n interval matrix and b = [\underline{b}, \overline{b}] = {b | \underline{b} ≤ b ≤ \overline{b}} an m-dimensional interval vector. Inequalities are taken componentwise, and it is assumed that \underline{A} ≤ \overline{A} and \underline{b} ≤ \overline{b}, so that both sets are nonempty. A system of linear interval equations, formally written as

    Ax = b,                                                  (2.1)

is the family of all systems of linear equations

    Ax = b,                                                  (2.2)

with

    A ∈ A, b ∈ b.                                            (2.3)

Here we study the question of solvability of general systems of linear interval equations with rectangular matrices.

Definition 2.1. Solvability: A system (2.1) is called solvable if each system in the family (2.2), (2.3) is solvable (i.e., has a solution).

Except for the trivial case \underline{A} = \overline{A}, \underline{b} = \overline{b}, the family (2.2), (2.3) consists of infinitely many linear systems. We shall show (Theorem 2.4) that a system (2.1) is solvable if and only if a finite number of linear systems are nonnegatively solvable (i.e., have nonnegative solutions). These systems can be written down explicitly: for each i ∈ {1, ..., m}, the ith equation of such a system is either of the form

    (\underline{A}x^1 − \overline{A}x^2)_i = \overline{b}_i            (2.4)
or of the form

    (\overline{A}x^1 − \underline{A}x^2)_i = \underline{b}_i.          (2.5)

Since for each of the m equations we have two options to choose from, there are altogether 2^m linear systems of this form in general; the matrix of each such system is of size m × 2n. But if the ith rows of \underline{A} and \overline{A} are equal and \underline{b}_i = \overline{b}_i, then (2.4) and (2.5) coincide. Hence the exact number of mutually different linear systems to be solved is 2^q, where q is the number of nonzero rows of the matrix (\overline{A} − \underline{A}, \overline{b} − \underline{b}).

Throughout the chapter we shall use the following notation. For an interval matrix A = [\underline{A}, \overline{A}], we define the center matrix

    Ac = (1/2)(\underline{A} + \overline{A})

and the radius matrix

    ∆ = (1/2)(\overline{A} − \underline{A}).

Then \underline{A} = Ac − ∆ and \overline{A} = Ac + ∆, so that we can also write A = [Ac − ∆, Ac + ∆]. Similarly, we write b = [bc − δ, bc + δ], where

    bc = (1/2)(\underline{b} + \overline{b}),   δ = (1/2)(\overline{b} − \underline{b}).

For a vector x = (x_i), its absolute value is defined by |x| = (|x_i|); the convex hull of a set X is denoted by conv X. We define

    Y_m = {y ∈ R^m ; y_j ∈ {−1, 1} for each j},

i.e., Y_m is the set of all ±1-vectors in R^m; its cardinality is obviously 2^m. Finally, for each y ∈ Y_m we denote

    Ty = diag(y_1, ..., y_m) =
        [ y_1   0   ...   0  ]
        [  0   y_2  ...   0  ]
        [ ...                ]
        [  0    0   ...  y_m ].

Note that for each y ∈ Y_m we have Ac − Ty∆ ∈ A, Ac + Ty∆ ∈ A and bc + Tyδ ∈ b.
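The notation is easy to realize in code. The following Python sketch (names are ours) enumerates Y_m, forms the matrices Ty, and checks two properties used repeatedly below: Ty^2 = I and |Ty M| = |M| componentwise, which is why Ac − Ty∆ always stays inside A.

```python
import itertools
import numpy as np

m = 3
# Y_m: all +/-1 vectors in R^m; its cardinality is 2^m
Ym = [np.array(y, dtype=float) for y in itertools.product((-1, 1), repeat=m)]
assert len(Ym) == 2 ** m

Ac = np.zeros((m, m))            # hypothetical center matrix
Delta = np.full((m, m), 0.5)     # hypothetical (nonnegative) radius matrix

for y in Ym:
    Ty = np.diag(y)              # T_y = diag(y_1, ..., y_m)
    assert np.allclose(Ty @ Ty, np.eye(m))        # T_y^2 = I
    assert np.all(np.abs(Ty @ Delta) <= Delta)    # hence Ac - Ty Delta lies in A
```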
Theorem 2.2 (Farkas lemma). Let A ∈ R^{m×n} and b ∈ R^m. Then the system Ax = b, x ≥ 0, has a solution if and only if each p ∈ R^m with A^T p ≥ 0 satisfies b^T p ≥ 0. Equivalently, exactly one of the following systems has a solution:

    System 1:  Ax = b, x ≥ 0;
    System 2:  A^T p ≥ 0, b^T p < 0.

Theorem 2.3 (The Oettli–Prager theorem). The solution set X = {x ; Ax = b for some A ∈ A, b ∈ b} satisfies

    X = {x ; |Ac x − bc| ≤ ∆|x| + δ}.                        (2.6)

In particular, if x satisfies the inequality in (2.6), then Ax = b for some A ∈ A and b ∈ b.

Proof. Let x ∈ X, so that Ax = b for some A ∈ A, b ∈ b. Then

    |Ac x − bc| = |(Ac − A)x + (b − bc)| ≤ |Ac − A||x| + |b − bc| ≤ ∆|x| + δ.

Conversely, let |Ac x − bc| ≤ ∆|x| + δ for some x. Define y ∈ R^m by

    y_i = (Ac x − bc)_i / (∆|x| + δ)_i   if (∆|x| + δ)_i > 0,
    y_i = 1                              otherwise            (i = 1, ..., m).

Then clearly |y| ≤ e (where e = (1, ..., 1)^T) and

    Ac x − bc = Ty(∆|x| + δ).

Put z = sgn x, so that |x| = Tz x. We get

    (Ac − Ty∆Tz)x = Ac x − bc + bc − Ty∆|x|
                  = Ty(∆|x| + δ) + bc − Ty∆|x|
                  = bc + Tyδ.

Since |y| ≤ e, we have |Ty∆Tz| ≤ ∆ and |Tyδ| ≤ δ, so that

    A_{yz} = Ac − Ty∆Tz ∈ A,   b_y = bc + Tyδ ∈ b,

and A_{yz}x = b_y. Therefore, x ∈ X.
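The converse construction in the proof can be illustrated numerically. The sketch below (Python/NumPy; the interval data are hypothetical) takes an x satisfying (2.6) and builds the matrix A_{yz} = Ac − Ty∆Tz and vector b_y = bc + Tyδ exactly as in the proof, then checks that A_{yz} ∈ A, b_y ∈ b and A_{yz}x = b_y.

```python
import numpy as np

# Hypothetical interval data: A = [Ac - D, Ac + D], b = [bc - d, bc + d]
Ac = np.array([[2.0, 1.0], [1.0, 3.0]])
D  = np.full((2, 2), 0.1)
bc = np.array([1.0, 2.0])
d  = np.array([0.2, 0.2])

x = np.array([0.25, 0.6])
res = Ac @ x - bc
den = D @ np.abs(x) + d
assert np.all(np.abs(res) <= den)            # x satisfies (2.6)

# Construction from the proof of Theorem 2.3
y = np.where(den > 0, res / np.where(den > 0, den, 1.0), 1.0)
z = np.sign(x)
z[z == 0] = 1.0
Ayz = Ac - np.diag(y) @ D @ np.diag(z)       # A_yz = Ac - Ty D Tz
by  = bc + y * d                             # b_y = bc + Ty delta

assert np.allclose(Ayz @ x, by)              # x solves A_yz x = b_y
assert np.all(np.abs(Ayz - Ac) <= D + 1e-12) # A_yz lies in A
assert np.all(np.abs(by - bc) <= d + 1e-12)  # b_y lies in b
```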
The following theorem shows that the solvability of Ax = b can be characterized in terms of the nonnegative solvability of a finite number of linear systems.

Theorem 2.4. A system of linear interval equations Ax = b is solvable if and only if for each y ∈ Y_m the system

    (Ac − Ty∆)x^1 − (Ac + Ty∆)x^2 = bc + Tyδ,                (2.7)
    x^1 ≥ 0, x^2 ≥ 0,                                        (2.8)

has a solution x^1_y, x^2_y. Moreover, if this is the case, then for each A ∈ A, b ∈ b the system Ax = b has a solution in the set conv{x^1_y − x^2_y ; y ∈ Y_m}.

Proof. "Only if": Let Ax = b be solvable, and suppose that (2.7), (2.8) does not have a solution for some y ∈ Y_m. Writing

    x = (x^1; x^2),   A = (Ac − Ty∆ | −(Ac + Ty∆)),   b = bc + Tyδ,

the system (2.7), (2.8) becomes Ax = b,
x ≥ 0.

By Farkas' lemma, there is a p ∈ R^m such that A^T p ≥ 0 and b^T p < 0, i.e.,

    (Ac − Ty∆)^T p ≥ 0,                                      (2.9)
    (Ac + Ty∆)^T p ≤ 0,                                      (2.10)
    (bc + Tyδ)^T p < 0.                                      (2.11)

Since Ty^T = Ty, (2.9) gives Ac^T p ≥ ∆^T Ty p and (2.10) gives Ac^T p ≤ −∆^T Ty p, so that

    ∆^T Ty p ≤ Ac^T p ≤ −∆^T Ty p,

hence

    |Ac^T p| ≤ −∆^T Ty p ≤ ∆^T |p|.

The Oettli–Prager theorem (2.6), applied to the interval system [Ac^T − ∆^T, Ac^T + ∆^T] z = [0, 0] (i.e., with center Ac^T, radius ∆^T, bc = 0, δ = 0), shows that there exists a matrix A ∈ A such that

    A^T p = 0.                                               (2.12)

By Farkas' lemma applied to (2.11) and (2.12), the system Ax = bc + Tyδ has no solution, which contradicts the solvability of Ax = b, since A ∈ A and bc + Tyδ ∈ b.

"If": Conversely, let the system (2.7), (2.8) have a solution x^1_y, x^2_y for each y ∈ Y_m. Let A ∈ A, b ∈ b; we prove that the system Ax = b has a solution. Put xy = x^1_y − x^2_y. We first show that

    Ty A xy ≥ Ty b                                           (2.13)

holds for each y ∈ Y_m. Indeed,

    Ty(A xy − b) = Ty(Ac xy − bc) + Ty(A − Ac)xy + Ty(bc − b)
                 ≥ Ty(Ac xy − bc) − ∆|xy| − δ

[since |Ty(A − Ac)xy| ≤ ∆|xy| and |Ty(bc − b)| ≤ δ]. Further, since xy = x^1_y − x^2_y with x^1_y, x^2_y ≥ 0, we have |xy| ≤ x^1_y + x^2_y, hence

    Ty(A xy − b) ≥ Ty(Ac(x^1_y − x^2_y) − bc) − ∆(x^1_y + x^2_y) − δ
                 = Ty(Ac(x^1_y − x^2_y) − bc) − Ty^2 ∆(x^1_y + x^2_y) − Ty^2 δ     [Ty^2 = I]
                 = Ty((Ac − Ty∆)x^1_y − (Ac + Ty∆)x^2_y − (bc + Tyδ))
                 = 0,

since (x^1_y, x^2_y) is a solution of (2.7), (2.8). This proves (2.13).

Next we show that the system of linear equations

    Σ_{y∈Y_m} λ_y A xy = b,                                  (2.14)
    Σ_{y∈Y_m} λ_y = 1                                        (2.15)

has a solution λ_y ≥ 0, y ∈ Y_m. By Farkas' lemma, it suffices to show that for each p ∈ R^m and each p_0 ∈ R,

    p^T A xy + p_0 ≥ 0 for each y ∈ Y_m                      (2.16)

implies

    p^T b + p_0 ≥ 0.                                         (2.17)

Define y ∈ Y_m by y_i = −1 if p_i ≥ 0 and y_i = 1 if p_i < 0 (i = 1, ..., m). Then clearly p = −Ty|p|, and, using Ty^T = Ty,

    p^T b + p_0 = −|p|^T Ty b + p_0 ≥ −|p|^T Ty A xy + p_0 = p^T A xy + p_0 ≥ 0,

where the first inequality uses (2.13) together with |p| ≥ 0, and the last one is (2.16). Hence the system (2.14), (2.15) has a solution λ_y ≥ 0, y ∈ Y_m. Put

    x = Σ_{y∈Y_m} λ_y xy;
then Ax = b by (2.14), (2.15), and x belongs to the set conv{xy ; y ∈ Y_m} = conv{x^1_y − x^2_y ; y ∈ Y_m} by (2.15).

Remark 2.5. The number of mutually different linear systems to be solved is 2^q, where q is the number of nonzero rows of the matrix (\overline{A} − \underline{A}, \overline{b} − \underline{b}); thus the number of mutually different linear systems is generally exponential. Let us analyse the equations of the system (2.7). If y_i = 1, then the ith rows of Ac − Ty∆ and Ac + Ty∆ are equal to the ith rows of \underline{A} and \overline{A}, respectively, and (bc + Tyδ)_i = \overline{b}_i. This means that in this case the ith equation of (2.7) has the form

    (\underline{A}x^1 − \overline{A}x^2)_i = \overline{b}_i.           (2.18)

Similarly, in the case y_i = −1 it is of the form

    (\overline{A}x^1 − \underline{A}x^2)_i = \underline{b}_i.          (2.19)

Hence the family of systems (2.7) for all y ∈ Y_m is just the family of all systems whose ith equations are either of the form (2.18) or of the form (2.19), i = 1, 2, ..., m. The number of mutually different such systems is exactly 2^q, where q is the number of nonzero rows of the matrix (∆, δ).

Note: In the "if" part of the proof we showed that for each A ∈ A and b ∈ b the equation Ax = b has a solution in the set conv{x^1_y − x^2_y ; y ∈ Y_m}. The proof, relying on the Farkas lemma, was purely existential; however, such a solution can be found in a constructive way. This is done in Chapter 3.

Now we shall show that the above Theorem 2.4 gives a unified view of the following theorems.
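Theorem 2.4 reduces the solvability of Ax = b to 2^m nonnegative linear feasibility problems, each of which can be handed to a linear programming solver. A sketch, assuming SciPy's `linprog` (the function name `interval_system_solvable` and the data are ours):

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def interval_system_solvable(Ac, D, bc, d):
    """Theorem 2.4: [Ac-D, Ac+D] x = [bc-d, bc+d] is solvable iff for each
    y in Y_m the system (Ac - Ty D) x1 - (Ac + Ty D) x2 = bc + Ty d
    has a solution with x1, x2 >= 0."""
    m, n = Ac.shape
    for y in itertools.product((-1, 1), repeat=m):
        Ty = np.diag(y)
        Aeq = np.hstack([Ac - Ty @ D, -(Ac + Ty @ D)])
        beq = bc + Ty @ d
        # zero objective: pure feasibility check; x >= 0 by default bounds
        res = linprog(np.zeros(2 * n), A_eq=Aeq, b_eq=beq)
        if not res.success:
            return False
    return True

# A regular center with small radii gives a solvable system (data hypothetical)
Ac = np.array([[2.0, 0.0], [0.0, 3.0]])
D = np.full((2, 2), 0.1)
assert interval_system_solvable(Ac, D, np.array([1.0, 1.0]), np.array([0.1, 0.1]))

# Unsolvable: the interval equation [-1, 1] x = 1 contains 0 * x = 1
assert not interval_system_solvable(np.array([[0.0]]), np.array([[1.0]]),
                                    np.array([1.0]), np.array([0.0]))
```

By Remark 2.5 the loop can be restricted to the 2^q sign vectors that differ on nonzero rows of (∆, δ); the sketch above enumerates all 2^m for simplicity.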
Definition 2.6. Nonnegative solvability: A linear interval system Ax = b is called nonnegatively solvable if each system Ax = b with A ∈ A, b ∈ b has a nonnegative solution.

Theorem 2.7. A system of linear interval equations Ax = b is nonnegatively solvable if and only if for each y ∈ Y_m the system

    (Ac − Ty∆)x = bc + Tyδ                                   (2.20)

has a nonnegative solution xy. Moreover, if this is the case, then for each A ∈ A, b ∈ b the system Ax = b has a nonnegative solution in the set conv{xy ; y ∈ Y_m}.

Proof. The ith equation of (2.20) is of the form (\underline{A}x)_i = \overline{b}_i if y_i = 1, and (\overline{A}x)_i = \underline{b}_i if y_i = −1; this is the system (2.7), (2.8) with x^2 = 0. Hence, as in the proof of Theorem 2.4, each system Ax = b has a solution in the set conv{xy ; y ∈ Y_m}, and this set consists of nonnegative vectors.

Definition 2.8. Regularity: A square interval matrix A is said to be regular if each A ∈ A is nonsingular.

Theorem 2.9. An interval matrix A is regular if and only if for each y ∈ Y_m the system

    (Ac − Ty∆)x^1 − (Ac + Ty∆)x^2 = y,                       (2.21)
    x^1 ≥ 0, x^2 ≥ 0,

has a solution.

Proof. If A is regular, then for each right-hand side b the system of linear interval equations Ax = b is solvable; hence, taking bc = y and δ = 0 in Theorem 2.4, the system (2.21) has a solution for each y ∈ Y_m. Further note that the system (2.21) can be written in the block form

    (Ac − Ty∆ | −(Ac + Ty∆)) (x^1; x^2) = y.
Remark 2.10. If we add a complementarity constraint, the above solution turns out to be unique:

Theorem 2.11. Let A be regular. Then for each y ∈ Y_m the system

    (Ac − Ty∆)x^1 − (Ac + Ty∆)x^2 = bc + Tyδ,                (2.22)
    x^1 ≥ 0, x^2 ≥ 0,                                        (2.23)
    (x^1)^T x^2 = 0,                                         (2.24)

has a unique solution x^1_y, x^2_y, and for the solution set X of Ax = b, defined by

    X = {x ; Ax = b for some A ∈ A, b ∈ b},

we have

    conv X = conv{x^1_y − x^2_y ; y ∈ Y_m}.                  (2.25)

Proof. We can write (2.22) as

    Ac(x^1 − x^2) − Ty∆(x^1 + x^2) = bc + Tyδ.

Put x = x^1 − x^2. Since (x^1)^T x^2 = 0, the ith components of x^1 and x^2 cannot both be nonzero; therefore, as x^1 ≥ 0 and x^2 ≥ 0,

    x^1 + x^2 = |x|.

Thus (2.22)–(2.24) can equivalently be written as

    Ac x − Ty∆|x| = bc + Tyδ.

Its unique solution xy satisfies xy = x^1_y − x^2_y, so that equation (2.25) takes the form

    conv X = conv{xy ; y ∈ Y_m}.
Now we can prove analogous results for systems of linear interval inequalities.

Theorem 2.12. A system of linear interval inequalities Ax ≤ b is solvable if and only if the system

    \overline{A}x^1 − \underline{A}x^2 ≤ \underline{b},   x^1 ≥ 0, x^2 ≥ 0,

has a solution.

We now give another, constructive proof of Theorem 2.12. First, we define:

Definition 2.13. Weak solvability: A system of linear interval inequalities Ax ≤ b is called weakly solvable if it is solvable in the sense of Definition 2.1.

Definition 2.14. Strong solvability: A system Ax ≤ b is called strongly solvable if there exists an x^0 satisfying Ax^0 ≤ b for each A ∈ A, b ∈ b, i.e., if all the systems Ax ≤ b, A ∈ A, b ∈ b, have a solution in common.

Proposition 2.15. For a system Ax ≤ b, the set of all strong solutions is

    X_S = {x^1 − x^2 ; \overline{A}x^1 − \underline{A}x^2 ≤ \underline{b}, x^1 ≥ 0, x^2 ≥ 0}.

Proof. Let x ∈ X_S. Put x^1 = x^+ = max{x, 0} (componentwise) and x^2 = x^− = max{−x, 0}. Then x^1 ≥ 0, x^2 ≥ 0 and x = x^1 − x^2. Furthermore, define a matrix A columnwise by

    A_j = \overline{A}_j if x_j ≥ 0,   A_j = \underline{A}_j if x_j < 0,

so that A ∈ A and Ax = \overline{A}x^1 − \underline{A}x^2. Since x is a strong solution, we have

    \overline{A}x^1 − \underline{A}x^2 = Ax ≤ \underline{b}.

Conversely, let x^1, x^2 be a nonnegative solution of \overline{A}x^1 − \underline{A}x^2 ≤ \underline{b} and let x = x^1 − x^2. Then for each A ∈ A, b ∈ b we have

    Ax = Ax^1 − Ax^2 ≤ \overline{A}x^1 − \underline{A}x^2 ≤ \underline{b} ≤ b;

hence x is a strong solution.

Theorem 2.16. A system of linear interval inequalities Ax ≤ b is (weakly) solvable if and only if all the systems Ax ≤ b, A ∈ A, b ∈ b, have a solution in common, i.e., if and only if Ax ≤ b is strongly solvable.
Proof. Obviously, a strongly solvable system Ax ≤ b is also weakly solvable. We shall prove that if a system Ax ≤ b is not strongly solvable, then it contains an unsolvable system A_0 x ≤ \underline{b}, where A_0 is of the following form: for each j ∈ {1, 2, ..., n} there is an i_j ∈ {1, 2, ..., m} such that

    (A_0)_ij = \overline{A}_ij                       for i < i_j,
    (A_0)_ij ∈ [\underline{A}_ij, \overline{A}_ij]   for i = i_j,          (2.26)
    (A_0)_ij = \underline{A}_ij                      for i > i_j.

By Proposition 2.15, the system of linear inequalities

    \overline{A}x^1 − \underline{A}x^2 ≤ \underline{b},   x^1 ≥ 0, x^2 ≥ 0,

does not have a solution. By Farkas' lemma, there is a vector y ≥ 0 satisfying

    \overline{A}^T y ≥ 0,   \underline{A}^T y ≤ 0,   \underline{b}^T y < 0.

For each j ∈ {1, 2, ..., n} and k ∈ {0, 1, ..., m} define the number

    t^j_k = Σ_{i≤k} y_i \overline{A}_ij + Σ_{i>k} y_i \underline{A}_ij.

Then

    Π_{k=1}^{m} t^j_{k−1} t^j_k = t^j_0 t^j_m Π_{k=1}^{m−1} (t^j_k)^2
                                = (\underline{A}^T y)_j (\overline{A}^T y)_j Π_{k=1}^{m−1} (t^j_k)^2 ≤ 0,

hence there exists a k ∈ {1, 2, ..., m} with t^j_{k−1} t^j_k ≤ 0. But since

    t^j_{k−1} = Σ_{i<k} y_i \overline{A}_ij + y_k \underline{A}_kj + Σ_{i>k} y_i \underline{A}_ij,
    t^j_k     = Σ_{i<k} y_i \overline{A}_ij + y_k \overline{A}_kj + Σ_{i>k} y_i \underline{A}_ij

differ only in their middle terms, by the intermediate value theorem there exists an α_j ∈ [\underline{A}_kj, \overline{A}_kj] such that

    Σ_{i<k} y_i \overline{A}_ij + y_k α_j + Σ_{i>k} y_i \underline{A}_ij = 0.        (2.27)

Set i_j = k and define A_0 columnwise by

    (A_0)_ij = \overline{A}_ij for i < i_j,   (A_0)_{i_j j} = α_j,   (A_0)_ij = \underline{A}_ij for i > i_j
    (i = 1, 2, ..., m, j = 1, 2, ..., n).

Then A_0 is of the form (2.26), A_0 ∈ A, and from (2.27) we have A_0^T y = 0, which together with y ≥ 0 and \underline{b}^T y < 0 implies, again by Farkas' lemma, that the system of linear inequalities A_0 x ≤ \underline{b} does not have a solution. Hence Ax ≤ b is not weakly solvable, which proves the theorem.

Remark 2.17. In the case of exact data, a system of linear equations

    Ax = b                                                   (2.28)

can be equivalently written as

    (A; −A) x ≤ (b; −b),                                     (2.29)

hence any algorithm for checking solvability of (2.29) can be employed for checking solvability of (2.28). This is no longer true in the case of inexact data: a system of linear interval equations

    Ax = b                                                   (2.30)

cannot be equivalently written as

    (A; −A) x ≤ (b; −b)                                      (2.31)

because of the dependence of data in (2.30), which is not reflected in (2.31), where the same coefficient is allowed to take on different values within its two occurrences. Hence the solution set of (2.30) is always a part of that of (2.31), but the converse inclusion need not be true.
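By Theorem 2.12 and Proposition 2.15, the (weak = strong) solvability of an interval system Ax ≤ b reduces to a single linear feasibility problem. A sketch with SciPy's `linprog` (the function name and the data are ours):

```python
import numpy as np
from scipy.optimize import linprog

def strongly_solvable(Al, Au, bl):
    """Theorem 2.12 / Proposition 2.15: [Al, Au] x <= b with b >= bl is
    (weakly and strongly) solvable iff Au x1 - Al x2 <= bl has a
    solution x1, x2 >= 0; x = x1 - x2 is then a strong solution."""
    n = Al.shape[1]
    res = linprog(np.zeros(2 * n), A_ub=np.hstack([Au, -Al]), b_ub=bl)
    return res.success, (res.x[:n] - res.x[n:]) if res.success else None

# Hypothetical data: one interval inequality in two unknowns
Al = np.array([[1.0, -1.0]])
Au = np.array([[2.0, 1.0]])
bl = np.array([-1.0])
ok, x = strongly_solvable(Al, Au, bl)
assert ok
# x is a common solution: the worst case max over A in [Al, Au] of (Ax)_i
# equals (Au x^+ + Al min(x, 0))_i, and it stays below bl
assert np.all(Au @ np.maximum(x, 0) + Al @ np.minimum(x, 0) <= bl + 1e-9)
```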
Chapter 3 An existence theorem for systems of linear equations

In this chapter we give a constructive proof of Theorem 2.4 of Chapter 2.

3.1 Introduction

Let

    Ax = b                                                   (3.1)

be a system of linear equations with an m × n matrix A. Denote

    Y_m = {y ∈ R^m ; |y_i| = 1 for each i},

so that Y_m consists of 2^m elements, and for each y ∈ Y_m let

    D_y = diag(y_1, ..., y_m),

i.e., (D_y)_ii = y_i for each i and (D_y)_ij = 0 for i ≠ j. Together with (3.1), we shall consider the family of systems of linear inequalities of the form

    D_y A x ≤ D_y b                                          (3.2)

for all y ∈ Y_m. Obviously, the ith inequality in (3.2) has the form (Ax)_i ≤ b_i if y_i = 1, and (Ax)_i ≥ b_i if y_i = −1. We shall prove:
Theorem 3.1. The system (3.1) has a (nonnegative) solution if and only if (3.2) has a (nonnegative) solution for each y ∈ Y_m.

Proof. The "only if" part is obvious, since a solution of (3.1) also satisfies (3.2) for each y ∈ Y_m. The "if" part is a consequence of the following theorem:

Theorem 3.2. Let (3.2) have a solution x_y for each y ∈ Y_m. Then (3.1) has a solution which is a convex combination of the x_y's.

We shall show that a solution of (3.1) can be constructed from the x_y's algorithmically. Note that if all the x_y's are nonnegative, then their convex combination is also a nonnegative vector. For the description of the algorithm we shall need a special order of the elements of Y_m, defined inductively via the sets Y_j, j = 1, ..., m − 1, in the following way:

(a) the order of Y_1 is −1, 1;
(b) if y_1, ..., y_{2^j} is the order of Y_j, then (y_1, −1), ..., (y_{2^j}, −1), (y_1, 1), ..., (y_{2^j}, 1) is the order of Y_{j+1}.

We additionally define Y_0 = {1}. Written as rows, the first few ordered sets are

    Y_0 : 1
    Y_1 : −1, 1
    Y_2 : (−1,−1), (1,−1), (−1,1), (1,1)
    Y_3 : (−1,−1,−1), (1,−1,−1), (−1,1,−1), (1,1,−1),
          (−1,−1,1), (1,−1,1), (−1,1,1), (1,1,1)
    Y_4 : (−1,−1,−1,−1), (1,−1,−1,−1), (−1,1,−1,−1), (1,1,−1,−1),
          (−1,−1,1,−1), (1,−1,1,−1), (−1,1,1,−1), (1,1,1,−1),
          (−1,−1,−1,1), (1,−1,−1,1), (−1,1,−1,1), (1,1,−1,1),
          (−1,−1,1,1), (1,−1,1,1), (−1,1,1,1), (1,1,1,1),

and so on. Further, for any sequence s_1, ..., s_{2h} with an even number of elements, each pair (s_j, s_{j+h}), j = 1, ..., h, is called a conjugate pair.

Remark 3.3. In the above list, the conjugate pairs of the ordered Y_2 are

    ((−1,−1), (−1,1)) and ((1,−1), (1,1)),

and those of the ordered Y_3 are

    ((−1,−1,−1), (−1,−1,1)), ((1,−1,−1), (1,−1,1)), ((−1,1,−1), (−1,1,1)), ((1,1,−1), (1,1,1));

conjugate vectors differ only in their last entry.

We now formulate the following "cancellation algorithm" for finding a solution of (3.1) from known solutions x_y of (3.2), y ∈ Y_m:

Algorithm:

Step 1: Form the sequence of vectors (x_y^T, (Ax_y − b)^T)^T, ordered in the order of Y_m.
Step 2: For each conjugate pair (x, x') in the current sequence compute

    λ = x'_k / (x'_k − x_k)   if x_k ≠ x'_k,
    λ = 1                     otherwise,

where k is the index of the current last entry, and set

    x := λx + (1 − λ)x'.

Step 3: Cancel the second half of the sequence, and in the remaining half delete the last entry of each vector.

Step 4: If there remains a single vector x, terminate; otherwise go to Step 2.

The algorithm and the previous theorems (3.1, 3.2) are justified by the following theorem.

Theorem 3.4. The vector x obtained in Step 4 of the algorithm satisfies Ax = b and x ∈ conv{x_y ; y ∈ Y_m}.

Proof. The algorithm starts with 2^m vectors of dimension n + m and proceeds by halving the sequence and deleting the last entry; hence it is finite and at the end yields a single n-dimensional vector x. Consider an (n + j)-dimensional vector x̃ in a current step of the algorithm before updating; there are 2^j such vectors. Let y ∈ Y_j be the vector which occupies the same position in the order of Y_j as x̃ does in the current sequence. Denote

    x^j_y = (x̃_1, ..., x̃_n)^T,   r^j_y = (x̃_{n+1}, ..., x̃_{n+j})^T.

We shall prove that for each j = m, ..., 1, 0 and each y ∈ Y_j there holds

    y_i (A x^j_y)_i ≤ y_i b_i      (i = 1, ..., j),          (3.3)
    (A x^j_y)_i = b_i              (i = j + 1, ..., m),      (3.4)
    (r^j_y)_i = (A x^j_y − b)_i    (i = 1, ..., j),          (3.5)
    x^j_y ∈ X,                                               (3.6)

where X = conv{x_y ; y ∈ Y_m}. We proceed by induction on j = m, ..., 0. The case j = m is trivial, since x^m_y = x_y for each y ∈ Y_m; hence (3.3) is equivalent to (3.2), (3.4) is vacuous, and (3.5) follows from the initial construction in Step 1.

So assume (3.3)–(3.6) to hold for some j ∈ {1, ..., m} and each y ∈ Y_j, and let y ∈ Y_{j−1}. Since, by the order of Y_j, any two conjugate vectors in Y_j differ only in the jth entry, x^{j−1}_y was constructed in Step 2 as

    x^{j−1}_y = λ x^j_{(y,−1)} + (1 − λ) x^j_{(y,1)},

where

    λ = (r^j_{(y,1)})_j / ((r^j_{(y,1)})_j − (r^j_{(y,−1)})_j)
      = (A x^j_{(y,1)} − b)_j / ((A x^j_{(y,1)} − b)_j − (A x^j_{(y,−1)} − b)_j),

provided the denominator is nonzero. Note that, by (3.3),

    (A x^j_{(y,1)} − b)_j ≤ 0   and   (A x^j_{(y,−1)} − b)_j ≥ 0,

so that if the denominator is nonzero, then λ ∈ [0, 1] and the jth component of A x^{j−1}_y − b is

    λ (A x^j_{(y,−1)} − b)_j + (1 − λ)(A x^j_{(y,1)} − b)_j = 0,

i.e., (A x^{j−1}_y)_j = b_j. If (A x^j_{(y,−1)} − b)_j = (A x^j_{(y,1)} − b)_j, then the common value is both nonnegative and nonpositive, hence zero, so that (A x^j_{(y,−1)})_j = b_j = (A x^j_{(y,1)})_j and again (A x^{j−1}_y)_j = b_j. Hence (3.4) holds for j − 1.

Since (3.3) and (3.4) are satisfied by x^j_{(y,−1)} and x^j_{(y,1)}, they are also satisfied by the convex combination x^{j−1}_y, i.e.,

    y_i (A x^{j−1}_y)_i ≤ y_i b_i   (i = 1, ..., j − 1),
    (A x^{j−1}_y)_i = b_i           (i = j + 1, ..., m).

From the updating formula in Step 2 we see that (3.5) also holds for j − 1. Since x^j_{(y,−1)}, x^j_{(y,1)} ∈ X and X is convex, we get x^{j−1}_y ∈ X, thus completing the induction.

For j = 0 we obtain from (3.4) and (3.6) that Ax^0_y = b and x^0_y ∈ X for the single remaining n-dimensional vector x^0_y, which is equal to the vector x from Step 4.

Example 3.5. Consider the system

    x_1 + x_2 − x_3 = 1,                                     (3.7)
    −2x_1 + 3x_2 + x_3 = 2.                                  (3.8)

Solution: Here m = 2, so

    Y_1 = {−1, 1},
    Y_2 : y_1 = (−1,−1), y_2 = (1,−1), y_3 = (−1,1), y_4 = (1,1).

The system (3.2), D_y A x ≤ D_y b, reads

    for y_1:  −x_1 − x_2 + x_3 ≤ −1,   2x_1 − 3x_2 − x_3 ≤ −2;
    for y_2:   x_1 + x_2 − x_3 ≤ 1,    2x_1 − 3x_2 − x_3 ≤ −2;
    for y_3:  −x_1 − x_2 + x_3 ≤ −1,  −2x_1 + 3x_2 + x_3 ≤ 2;
    for y_4:   x_1 + x_2 − x_3 ≤ 1,   −2x_1 + 3x_2 + x_3 ≤ 2.

Consider the following solutions of (3.2):

    x_{y_1} = x_{(−1,−1)} = (0, 1, 0)^T,
    x_{y_2} = x_{(1,−1)}  = (0, 0, 3)^T,
    x_{y_3} = x_{(−1,1)}  = (2, 0, 0)^T,
    x_{y_4} = x_{(1,1)}   = (0, 0, 0)^T.

Step 1: we form the sequence (x_y^T, (Ax_y − b)^T)^T:

    (0, 1, 0, 0, 1), (0, 0, 3, −4, 1), (2, 0, 0, 1, −6), (0, 0, 0, −1, −2).

The conjugate pairs are

    (0, 1, 0, 0, 1), (2, 0, 0, 1, −6)   and   (0, 0, 3, −4, 1), (0, 0, 0, −1, −2).

For the first pair the current last entries are x_k = 1 and x'_k = −6, and hence

    λ = x'_k / (x'_k − x_k) = −6/(−6 − 1) = 6/7,

so

    x = λx + (1 − λ)x' = (6/7)(0, 1, 0) + (1/7)(2, 0, 0) = (2/7, 6/7, 0),

with remaining residual entry (6/7)·0 + (1/7)·1 = 1/7. Similarly, for the conjugate pair (0, 0, 3, −4, 1), (0, 0, 0, −1, −2) we get

    λ = −2/(−2 − 1) = 2/3,

so

    x = (2/3)(0, 0, 3) + (1/3)(0, 0, 0) = (0, 0, 2),

with remaining residual entry (2/3)(−4) + (1/3)(−1) = −3. The new sequence (x^T, (Ax − b)^T)^T is

    (2/7, 6/7, 0, 1/7), (0, 0, 2, −3).

For this conjugate pair

    λ = −3 / (−3 − 1/7) = 21/22,

and hence

    x = (21/22)(2/7, 6/7, 0) + (1/22)(0, 0, 2) = (3/11, 9/11, 1/11).

This is a solution of the original system: 3/11 + 9/11 − 1/11 = 1 and −6/11 + 27/11 + 1/11 = 2.
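The cancellation algorithm is straightforward to implement; the following Python sketch (the function name is ours) reproduces Example 3.5.

```python
import numpy as np

def cancellation_algorithm(A, b, solutions):
    """Cancellation algorithm of Chapter 3: from solutions x_y of
    D_y A x <= D_y b, given in the inductive order of Y_m, build a
    solution of A x = b as a convex combination of the x_y's."""
    # Step 1: sequence of vectors (x_y, A x_y - b), in the order of Y_m
    seq = [np.concatenate([x, A @ x - b]) for x in solutions]
    while len(seq) > 1:
        h = len(seq) // 2
        new_seq = []
        for j in range(h):                     # conjugate pairs (s_j, s_{j+h})
            x, xp = seq[j], seq[j + h]
            k = len(x) - 1                     # index of the current last entry
            lam = xp[k] / (xp[k] - x[k]) if x[k] != xp[k] else 1.0
            # Step 2 update, then Step 3: delete the last entry
            new_seq.append((lam * x + (1 - lam) * xp)[:-1])
        seq = new_seq                          # second half is cancelled
    return seq[0]

# Example 3.5: x1 + x2 - x3 = 1, -2 x1 + 3 x2 + x3 = 2
A = np.array([[1.0, 1.0, -1.0], [-2.0, 3.0, 1.0]])
b = np.array([1.0, 2.0])
# solutions of D_y A x <= D_y b in the order (-1,-1), (1,-1), (-1,1), (1,1)
xs = [np.array(v, dtype=float)
      for v in ([0, 1, 0], [0, 0, 3], [2, 0, 0], [0, 0, 0])]
x = cancellation_algorithm(A, b, xs)
assert np.allclose(x, [3/11, 9/11, 1/11])
assert np.allclose(A @ x, b)
```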
References

[1] W. Oettli and W. Prager, Compatibility of approximate solution of linear equations with given error bounds for coefficients and right-hand sides, Numerische Mathematik, 6 (1964), 405–409.

[2] J. Rohn, Solvability of systems of linear interval equations, SIAM Journal on Matrix Analysis and Applications, 25 (2003), No. 1, 237–245.

[3] J. Rohn, An existence theorem for systems of linear equations, Linear and Multilinear Algebra, 29 (1991), 141–144.

[4] J. Rohn and J. Kreslová, Linear interval inequalities, Linear and Multilinear Algebra, 38 (1994), 79–82.

[5] J. Rohn, Systems of linear interval equations, Linear Algebra and Its Applications, 126 (1989), 39–78.