NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2000; 00:1–6 Prepared using nlaauth.cls [Version: 2002/09/18 v1.02]
On the Perturbation of the Q-factor of the QR Factorization X.-W. Chang∗ McGill University, School of Comptuer Science, 3480 University Street, McConnell Engineering Building, Room 318, Montreal, H3A 2A7, Canada
SUMMARY This paper gives normwise and componentwise perturbation analyses for the Q-factor of the QR factorization of the matrix A with full column rank when A suffers from an additive perturbation. Rigorous perturbation bounds are derived on the projections of the perturbation of the Q-factor in the range of A and its orthogonal complement. These bounds overcome a serious shortcoming of the first-order perturbation bounds in the literature and can be used safely. From these bounds, identical or equivalent first-order perturbation bounds in the literature can easily be derived. When A is square and nonsingular, tighter and simpler rigorous perturbation bounds on the perturbation of the Q-factor c 2000 John Wiley & Sons, Ltd. are presented. Copyright key words:
QR factorization, perturbation analysis
1. INTRODUCTION One of the important matrix factorizations in matrix computations is the QR factorization. If A ∈ Rm×n has full column rank, then it has a unique QR factorization A = QR, where Q ∈ Rm×n has orthonormal columns and R ∈ Rn×n is upper triangular with positive diagonal entries. The matrices Q and R are referred to as the Q-factor and the R-factor, respectively. Suppose that ∆A ∈ Rm×n is a small perturbation matrix such that A + ∆A has full column rank, then A + ∆A has a unique QR factorization A + ∆A = (Q + ∆Q)(R + ∆R), where Q + ∆Q has orthonormal columns and R + ∆R is upper triangular with positive diagonal entries. In the perturbation analysis of the QR factorization, one typically derives bounds on k∆Qk (or |∆Q|) and k∆Rk (or |∆R|) in terms of (a bound on) k∆Ak (or |∆A|), where for a matrix C = (cij ), k · k is a norm and |C| is defined by (|cij |). The perturbation analysis of the QR factorization has been extensively studied. Three types of analysis have been presented, each corresponding to different information on ∆A: 1. Given k∆AkF (or kPA ∆AkF and kPA⊥ ∆AkF , where PA and PA⊥ denotes the orthogonal projectors onto the range of A and its orthogonal complement, respectively), the first rigorous
∗ Correspondence
to:
[email protected] Contract/grant sponsor: The author’s research was supported by NSERC of Canada grant RGPIN217191-07.
c 2000 John Wiley & Sons, Ltd. Copyright
2
X.-W. CHANG
normwise perturbation bounds for the Q and R factors were presented by Stewart [8]. Then improved rigorous bounds and new first-order bounds were presented by Sun [10]. By a different approach Stewart [9] gave first-order bounds, which are similar to the first-order bounds of Sun [10]. Later Sun [12] gave new rigorous perturbation bounds for the Q-factor alone. The main contribution of [12] is that a first-order bound which can easily be obtained from one of the √ rigorous bounds improves the first-order bounds given in [10] and [9] by a factor of 1 + 1/ 2. This first-order bound was also obtained by Bhatia and Mukherjea [2] earlier by a different approach when A is square and nonsingular. In [1], the same first-order bound for the R-factor as that given by Sun [10] was derived when A is square and nonsingular. Using two different approaches, Chang et al [4] gave two new first-order perturbation bounds for the R-factor. Both can be arbitrarily smaller than the previous first-order bounds. One of those two bounds is expensive to estimate, but it is optimal, leading to the condition number for the R-factor, which is bounded above by a function of n when the standard column pivoting strategy is used in the QR factorization. To obtain tight results and see clearly where any ill conditioning lies, Chang et al [4] looked at how ∆Q is distributed between the range of Q (equivalently the range of A) and its orthogonal complement. Specifically two tight first-order bounds on kPA ∆QkF and kPA⊥ ∆QkF were given in [4]. Very recently, a new rigorous bound for the R-factor was given by Chang and Stehl´e [5], which is a small constant multiple of one of the first-order bounds given in [4]. 2. Given |∆A|, rigorous bounds on |∆Q| and |∆R| were given by Sun [11]. 3. Suppose |∆A| ≤ ǫC|A|, where 1 ≤ cij ≤ 1. An important motivation for considering such a class of perturbations is that the equivalent backward rounding error from a rounding error analysis of a standard QR factorization algorithm fits in this class. The first first-order normwise perturbation bounds for both the Q-factor and R-factor were presented by Zha [14]. Later Chang and Paige [3] presented rigorous perturbation bounds for both the Q-factor and R-factor, which are small constant multiples of Zha’s first-order bounds. For the R-factor, [3] also gave two new first-order perturbation bounds, which can be arbitrarily smaller than Zha’s first-order bound. For the Q factor, [3] gave nearly tight first-order bounds on kPA ∆QkF and kPA⊥ ∆QkF , respectively. Recently Chang et al [6] derived a new rigorous bound for the R-factor, which is a small constant multiple of one of the two first-order bounds given in [3]. In the above we mentioned that [4] and [3] gave first-order bounds on kPA ∆QkF and kPA⊥ ∆QkF for the first type and the third type of perturbation in A, respectively. Both [4] and [3] pointed out that the first-order bounds on kPA ∆QkF must be used carefully, since they may be much smaller than the actual kPA ∆QkF somtimes. In this paper we will give rigorous bounds on kPA ∆QkF to avoid this problem, so that the bounds can be used safely in all cases. The first-order bounds obtained from these rigorous bounds are the same or equivalent to those given in [4] and [3]. We will also present rigorous bounds on kPA⊥ ∆QkF . For the special case where A is square and nonsingular we will derive tighter bounds on kPA ∆QkF , which is then identical to k∆QkF . The results fill in gaps left in [5] and [6], which do not present any new rigorous perturbation bounds on the Q-factor. 2. NOTATION AND BASICS For any matrix X ∈ Rm×n , we define κ2 (X) = kXk2 kX † k2 , c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
cond2 (X) = k |X|·|X †| k2 , Numer. Linear Algebra Appl. 2000; 00:1–6
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
3
where X † is the Moore-Penrose pseudo-inverse of X. Let A ∈ Rm×n with full column rank have the QR factorization A = QR, then A† = R−1 QT ,
kAk2 = kRk2 , κ2 (A) = κ2 (R). m×(m−n) ¯ ¯ ∈ Rm×m is orthogonal. Then Let Q ∈ R be an orthonormal matrix such that [Q, Q] the orthogonal projectors PA and PA⊥ onto the range of A and its orthogonal complement respectively satisify ¯Q ¯T . PA = QQT , PA⊥ = Q It is easy to verify that for any matrix X ∈ Rm×k we have kPA Xkp = kQT Xkp and ¯ T Xkp for p = 2, F . For any matrix X = [Xn−1 , x] = (xij ) ∈ Rn×n , define kPA⊥ Xkp = kQ (see [4]) 1 x12 · x1n 2 x11 1 · x2n 2 x22 , low(X) ≡ [up(X T )]T . up(X) ≡ (1) · · 1 2 xnn It is easy to verify that low(X) − {low(X)}T = low([Xn−1 , 0]) − {low([Xn−1 , 0])}T , kup(X)kF ≤ kXkF , √ kup(X T + X)kF ≤ 2kXkF , 1 kup(X)kF ≤ √ kXkF , if X T = X, 2 √ kX − up(X + X T )kF = klow(X) − [low(X)]T kF ≤ 2kXkF .
(2) (3) (4) (5) (6)
Later when we derive perturbation bounds, we will use the following lemma, which was essentially proven in [4, Section 4]. i h r ∈ Rn×n and F = [Fn−1 , f ] ∈ Rn×n , Lemma 2.1. Given upper triangular R = Rn−1 0 rnn then √ √ √ −1 −1 −1 kF ≤ 2kFn−1 kF kRn−1 k2 ≤ 2kF kF kRn−1 k2 . klow(F R−1 )−[low(F R−1 )]T kF ≤ 2kFn−1 Rn−1 (7) The above bounds are tight, in that for any R, F can be chosen such that the equalities hold. −1 −1 Rn−1 −Rn−1 r/rnn Proof. Since F = [Fn−1 , f ] and R−1 = , from (2) 0 1/rnn T −1 −1 low(F R−1 ) − [low(F R−1 )]T = low [Fn−1 Rn−1 , 0] − low [Fn−1 Rn−1 , 0] . Therefore, by (6)
klow(F R−1 ) − [low(F R−1 )]T kF ≤
√ √ −1 −1 2kFn−1 Rn−1 kF ≤ 2kFn−1 kF kRn−1 k2 .
For any R, we can choose F such that the above inequalities become equalities. In fact, for any −T −1 R, take F = [en y T , 0], where en = [0, . . . , 0, 1]T ∈ Rn , y 6= 0 and kRn−1 yk2 = kRn−1 k2 kyk2 , then
−T √
0 −Rn−1 y
= 2kR−T yk2 klow(F R−1 ) − [low(F R−1 )]T kF = n−1
y T R−1
0 n−1 F √ √ √ −1 −1 −1 k2 kFn−1 kF = 2kRn−1 k2 kF kF . = 2kRn−1 k2 kyk2 = 2kRn−1 c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
4
X.-W. CHANG
3. MAIN RESULTS We first consider the type 1 analysis mentioned in Section 1. Theorem 3.1. Let A = [An−1 , an ] ∈ Rm×n be of full column rank with QR factorization A = QR, and let ∆A = [∆An−1 , ∆an ] ∈ Rm×n be a perturbation in A. Let PA and PA⊥ be the orthogonal projectors onto the range of A and the orthogonal complement of the range of A, respectively. If k∆Ak2 < 1, (8) κ2 (A) kAk2
then A + ∆A has a unique QR factorization
A + ∆A = (Q + ∆Q)(R + ∆R), where kPA ∆QkF ≤
√ n−1 kF 2κ2 (An−1 ) kPAkA∆A n−1 k2
n−1 k2 1 − κ2 (An−1 ) k∆A kAn−1 k2 2 √ F 2κ2 (A) k∆Ak kAk2 + , 2 1 − κ2 (A) k∆Ak kAk2
and when m = n,
k∆QkF ≤
+
(9)
√ n−1 k2 2κ2 (An−1 ) k∆A kAn−1 k2
n−1 k2 1 − κ2 (An−1 ) k∆A kAn−1 k2
√ n−1 kF 2κ2 (An−1 ) k∆A kAn−1 k2
n−1 k2 1 − κ2 (An−1 ) k∆A kAn−1 k2
·
√ F 2κ2 (A) k∆Ak kAk2
2 1 − κ2 (A) k∆Ak kAk2
.
(10)
(11)
If the condition (8) is strengthened to (1 + then
√
2)κ2 (A)
k∆AkF < 1, kAk2
(12)
¯T
∆AkF κ2 (A) kQkAk 2 . kPA⊥ ∆QkF ≤ √ F 1 − (1 + 2)κ2 (A) k∆Ak kAk2
(13)
Proof. For t ∈ [0, 1], QT (A + t∆A) = R(I + tR−1 QT ∆A), where ktR−1 QT ∆Ak2 ≤ kA† k2 kQT ∆Ak2 < 1 by (8). Thus QT (A + t∆A) is nonsingular, and then A + t∆A has full column rank and has the unique QR factorization A + t∆A = Q(t)R(t), which, with Q(0) = Q, R(0) = R, Q(1) = Q + ∆Q and R(1) = R + ∆R, gives (9). Notice that
Z
Z 1
T Z t
1
T T ˙
= ˙ )dτ ˙ kQ ∆QkF = Q(t) − Q Q(t)dt Q(τ Q(t)dt
0 0 0 F F Z 1 2 Z 1 ˙ ˙ kQ(t)T Q(t)k ≤ kQ(t)k , F dt + F dt 0
c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
(14)
(15)
0
Numer. Linear Algebra Appl. 2000; 00:1–6
5
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
˙ where Q(t) is the derivative of Q(t) with respect to t. Thus, in order to bound kQT ∆QkF , in ˙ ˙ the following we derive bounds on kQ(t)T Q(t)k F and kQ(t)kF . From (14) we obtain (A + t∆A)T (A + t∆A) = R(t)T R(t). Differentiating both sides of this equation with respect to t leads to ˙ T R(t) + R(t)T R(t). ˙ ∆AT (A + t∆A) + (A + t∆A)T ∆A = R(t) Then, using (14) and multiplying both sides by R(t)−T from the left and by R(t)−1 from the right, we obtain −1 ˙ ˙ R(t)−T R(t) + R(t)R(t) = R(t)−T ∆AT Q(t) + Q(t)T ∆AR(t)−1 .
Therefore, with the ‘up’ notation defined in (1), −1 ˙ R(t)R(t) = up[R(t)−T ∆AT Q(t) + Q(t)T ∆AR(t)−1 ].
(16)
Differentiating both sides of (14) with respect to t and then multiplying both sides of the resulting equation by R(t)−1 leads to −1 ˙ ˙ Q(t) = ∆AR(t)−1 − Q(t)R(t)R(t) .
(17)
Substituting (16) into (17) and multiplying both sides of the resulting equation by Q(t)T from the left, we obtain ˙ Q(t)T Q(t) = Q(t)T ∆AR(t)−1 − up[R(t)−T ∆AT Q(t) + Q(t)T ∆AR(t)−1 ] = low(Q(t)T ∆AR(t)−1 ) − (low(Q(t)T ∆AR(t)−1 ))T .
(18) i Rn−1 (t) r(t) Let Q(t) and R(t) have the partitioning: Q(t) = [Qn−1 (t), qn (t)] and R(t) = 0 rnn (t) . i h r In particular, when t = 0, Q = [Qn−1 , qn ] and R = Rn−1 0 rnn . Taking the F-norm in (18) and applying Lemma 2.1, we get h
T −1 ˙ kQ(t)T Q(t)k ) − (low(Q(t)T ∆AR(t)−1 ))T kF F = klow(Q(t) ∆AR(t) √ ≤ 2kQ(t)T ∆An−1 Rn−1 (t)−1 kF √ ≤ 2kQ(t)T ∆An−1 kF kRn−1 (t)−1 k2 .
(19) (20)
¯ ¯ T from the left, we obtain Let [Q(t), Q(t)] ∈ Rm×m be orthogonal. Multiplying (17) by Q(t) ¯ T Q(t) ˙ ¯ T ∆AR(t)−1 . Q(t) = Q(t)
(21)
−1 ¯ T Q(t)k ˙ ¯ T ¯ T ∆AkF kR(t)−1 k2 . kQ(t) kF ≤ kQ(t) F = kQ(t) ∆AR(t)
(22)
Then it follows that
Since A + t∆A = Q(t)R(t), for t ∈ [0, 1], kR(t)−1 k2 =
1 kA† k2 1 ≤ . ≤ σmin (R(t)) σmin (A) − tk∆Ak2 1 − kA† k2 k∆Ak2
(23)
Similarly, since An−1 + t∆An−1 = Qn−1 (t)Rn−1 (t), we have kRn−1 (t)−1 k2 ≤ c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
kA†n−1 k2
1 − kA†n−1 k2 k∆An−1 k2
,
(24)
Numer. Linear Algebra Appl. 2000; 00:1–6
6
X.-W. CHANG
where kA†n−1 k2 k∆An−1 k2 ≤ kA† k2 k∆Ak2 < 1. Then, it follows from (19), (22) and (23) that q 2 2 + kQ(t) ˙ ˙ ˙ ¯ T Q(t)k kQ(t)k = kQ(t)T Q(t)k F F F q ¯ T ∆AR(t)−1 k2 ≤ 2kQ(t)T ∆AR(t)−1 k2F + kQ(t) F √ −1 (25) ≤ 2 k∆AR(t) kF √ 2 kA† k2 k∆AkF ≤ . (26) 1 − kA† k2 k∆Ak2 From (20), it follows by using (24) and (26) that
T √ Z t
−1 T ˙
˙ Q(τ )dτ + Q ∆An−1 kQ(t) Q(t)kF ≤ 2
kRn−1 (t) k2 0 F Z 1 √ ˙ )kF dτ kRn−1 (t)−1 k2 kQ(τ ≤ 2 kQT ∆An−1 kF + k∆An−1 k2 0 ! √ √ kA†n−1 k2 2k∆An−1 k2 kA† k2 k∆AkF T ≤ 2 kQ ∆An−1 kF + 1 − kA† k2 k∆Ak2 1 − kA†n−1 k2 k∆An−1 k2 √ √ √ T n−1 kF n−1 k2 F 2κ2 (An−1 ) kQkA∆A 2κ2 (An−1 ) k∆A 2κ2 (A) k∆Ak kAn−1 k2 kAk2 n−1 k2 . ≤ + · n−1 k2 n−1 k2 2 1 − κ2 (An−1 ) k∆A 1 − κ2 (An−1 ) k∆A 1 − κ2 (A) k∆Ak kAk2 kAn−1 k2 kAn−1 k2 (27) With (26) and (27), from (15) we can now conclude that (10) holds. When m = n, Q(t) ∈ Rn×n is orthogonal. Using (24), from (20) we obtain √ 2k∆An−1 kF kA†n−1 k2 ˙ kQ(t)k ≤ . F 1 − kA†n−1 k2 k∆An−1 k2 Then we have
Z
k∆QkF =
1 0
Z
˙ Q(t) dt ≤
0
F
1
˙ kQ(t)k F dt ≤
√ 2k∆An−1 kF kA†n−1 k2
1 − kA†n−1 k2 k∆An−1 k2
,
leading to (11). Now we want to show (13). Since
¯ T ∆QkF = kQ ¯T kQ R1
Z
1 0
˙ dtkF ≤ Q(t)
Z
1
0
¯ T Q(t)k ˙ kQ F dt,
(28)
¯T ¯ T Q(t)k ˙ kQ F dt in the following. Multiplying (17) by Q from the left, we obtain Z t −1 −1 ¯ T Q(t) ˙ =Q ¯ T ∆AR(t)−1 − Q ¯ T Q(t)R(t)R(t) ˙ ¯ T ∆AR(t)−1 − ¯ T Q(τ ˙ ) dτ R(t)R(t) ˙ Q =Q Q ,
we bound
0
0
¯ T Q(0) = 0. Then it follows that where we have used the fact that Q Z 1 T −1 −1 ¯ T Q(t)k ˙ ¯ ˙ ˙ )kF dτ. ¯ T Q(τ kQ ≤ k Q ∆AR(t) k + k R(t)R(t) k kQ F F F
(29)
0
c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
7
−1 ˙ To bound kR(t)R(t) kF , we take the F-norm on both sides of (16) and apply (4), leading to √ −1 ˙ kR(t)R(t) kF ≤ 2kQ(t)T ∆AR(t)−1 kF . (30)
Then, from (29), (30) and (23) it follows Z
1
¯ T Q(t)k ˙ kQ F dt
0
≤
Z
1
¯ T ∆AR(t)−1 kF dt + kQ
0
Z
1
Z
0
1
Z √ T −1 2kQ(t) ∆AR(t) kF Z
1
¯ T Q(τ ˙ )kF dτ kQ
0
1 √ kR(t)−1 k2 dt + 2k∆AkF kR(t)−1 k2 dt 0 0 √ Z ¯ T ∆AkF kA† k2 2k∆AkF kA† k2 1 ¯ T ˙ kQ ≤ + kQ Q(τ )kF dτ. 1 − kA† k2 k∆Ak2 1 − kA† k2 k∆Ak2 0
¯ T ∆AkF ≤ kQ
Z
0
1
(31)
¯ T Q(τ ˙ )kF dτ kQ
Therefore, Z
0
1
¯ T Q(t)k ˙ kQ F dt ≤
1
¯ T ∆AkF kA† k2 kQ 1−kA† k2 k∆Ak2 √ 2kA† k2 k∆AkF − 1−kA † k k∆Ak 2 2
¯T
∆AkF κ2 (A) kQkAk 2 , ≤ √ F 1 − (1 + 2)κ2 (A) k∆Ak kAk2
where the denominator of the most right hand side is positive due to the condition (12). Thus from (28) we can conclude that (13) holds. In the following we make some remarks. Our computations were performed in MATLAB 7.10.0 on a MacBook running Mac OS X 10.5.8. Remark 3.1. From (10) we obtain the following first-order bounds on kPA ∆QkF : kPA ∆QkF .
√ √ kPA ∆AkF kPA ∆An−1 kF 2κ2 (An−1 ) ≤ 2kA†n−1 k2 kAk2 . kAn−1 k2 kAk2
(32)
The above second bound was given in [4] and proved tight. The first one was not explicitly given in [4], although it can be seen it holds from the proof given there. In the proof of Theorem 3.1, we used the second inequality in (7), while in [4] the third inequality in (7) was used. The first bound is independent of the perturbation in the last column of A, while the second bound depends on it. Thus the second bound can be significantly larger than the first bound. Here is an example: −10 1 0 10 −10−6 √ 10−6 , kPA ∆QkF ≈ 2 × 10−10 , A = 0 10−4 , ∆A = 10−10 0 0 10−10 0 √ √ kPA ∆AkF kPA ∆An−1 kF = 2 × 10−10 , ≈ 2 × 10−6 . 2κ2 (An−1 ) 2kA†n−1 k2 kAk2 kAn−1 k2 kAk2 Remark 3.2. When we use the first-order bounds on kPA ∆QkF in (32) we have to be careful. Notice that kPA ∆An−1 kF can be very small or even zero, but kPA ∆QkF may not be very small; c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
8
X.-W. CHANG
see the following example: 1 0 A = 0 10−4 , 0 0
−10 10 ∆A = 10−10 10−10
−10−6 10−6 , 10−6
(1) √ kPA ∆An−1 kF = 2 × 10−10 , 2κ2 (An−1 ) kAn−1 k2
kPA ∆QkF ≈ 4.9 × 10−5 ,
bound (10) ≈ 6.2 × 10−4 .
This example indicates that sometimes the first-order bound may significantly underestimate the true perturbation and the rigorous bound (10) should be used instead. Remark 3.3. From (13) we obtain the following first-order bound on kPA⊥ ∆QkF presented in [4]: kP ⊥ ∆AkF . kPA⊥ ∆QkF . κ2 (A) A kAk2 Remark 3.4. From (26) we obtain the following rigorous bound on k∆AkF : √ Z 1 F 2κ2 (A) k∆Ak kAk2 ˙ kQ(t)kF dt ≤ k∆QkF ≤ . 2 1 − κ2 (A) k∆Ak 0 kAk2
(33)
This is the simplest and most often cited rigorous bound among a few rigorous bounds given in [12]. We now compare the bound in (11) with the bounds in (10) and (33) when A is square. Notice that in (10), when A is square, kPA ∆QkF = k∆QkF and kPA ∆An−1 kF = k∆An−1 kF . Thus the bound in (10) has two more terms than the bound in (11) and the latter is sharper than the former. If we compare (11) with (33), we can also observe that the bound in (11) is sharper. Here we can give an example to illustrate the comparisons. −10 √ 1 0 10 −10−6 A= , ∆A = , k∆QkF ≈ 2 × 10−10 , 0 10−4 10−10 10−6 √ k∆An−1 kF /kAn−1 k2 = k∆An−1 k2 /kAn−1 k2 = 2 × 10−10 , κ2 (An−1 ) = 1, √ k∆AkF /kAk2 ≈ k∆Ak2 /kAk2 ≈ 2 × 10−6 , κ2 (A) = 104 , bound (11) ≈ 2 × 10−10 ,
bound (10) ≈ 4.1 × 10−4 ,
bound (33) ≈ 2.0 × 10−2 .
Now we consider the type 3 analysis mentioned in Section 1. Theorem 3.2. Let A ∈ Rm×n be of full column rank with QR factorization i A = QR, where h r m×n the upper triangular matrix R ∈ Rn×n has the partitioning R = Rn−1 0 rnn . Let ∆A ∈ R be a perturbation matrix in A such that |∆A| ≤ ǫC|A|, C ∈ Rm×m , 0 ≤ cij ≤ 1, ǫ a small constant.
(34)
Let PA and PA⊥ be the orthogonal projectors onto the range of A and its orthogonal complement, respectively. Define η = kC|Q| kF ,
χ2 (R, D) = k |R|D−1 k2 kDR−1 k2 ,
c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
χ2 (R) = inf χ2 (R, D), D∈Dn
Numer. Linear Algebra Appl. 2000; 00:1–6
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
9
where Dn is the set of all n × n positive diagonal entries. If ηχ2 (R)ǫ < 1,
(35)
then A + ∆A has a unique QR factorization A + ∆A = (Q + ∆Q)(R + ∆R), where
√ 2ηχ2 (Rn−1 )ǫ + kPA ∆QkF ≤ 1 − ηχ2 (Rn−1 )ǫ
and if m = n, k∆QkF ≤
(36)
!2 √ 2ηχ2 (R)ǫ , 1 − ηχ2 (R)ǫ
√ 2η χ2 (Rn−1 )ǫ . 1 − η χ2 (Rn−1 )ǫ
(37)
(38)
If the condition (35) is strengthened to (1 +
√
2)ηχ2 (R)ǫ < 1,
(39)
then kPA⊥ ∆QkF ≤
ηχ2 (R)ǫ √ . 1 − (1 + 2)ηχ2 (R)ǫ
(40)
Proof. We will use the proof for Theorem 3.1. Since (35) holds, there exists D ∈ Dn such that ηχ2 (R, D)ǫ < 1. Let D = diag(Dn−1 , dn ) ∈ Dn with Dn−1 ∈ Dn−1 . Then it is easy to verify that ηχ(Rn−1 , Dn−1 )ǫ < 1. For t ∈ [0, 1], QT (A + t∆A) = (I + tQT ∆AR−1 )R. But kQT ∆AR−1 kF ≤ k∆AD−1 k2 kDR−1 k2 ≤ kC|Q| kF k |R|D−1 k2 kDR−1 k2 ǫ = ηχ2 (R, D)ǫ < 1. Thus QT (A + t∆A) is nonsingular, and then A + t∆A has full column rank and has the unique QR factorization A + t∆A = Q(t)R(t), (41) which, with Q(0) = Q, R(0) = R, Q(1) = Q + ∆Q and R(1) = R + ∆R, gives (36). Now we derive two inequalities which will be used a few times later. Since A + t∆A = Q(t)R(t), we have QRD−1 + t∆AD−1 = Q(t)R(t)D−1 . Therefore, for t ∈ [0, 1], kDR(t)−1 k2 =
1 kDR−1 k2 1 ≤ ≤ . −1 −1 −1 σmin (R(t)D ) σmin (RD ) − tk∆AD k2 1 − ηk |R|D−1 k2 kDR−1 k2 ǫ (42)
Then, k∆AR(t)−1 kF ≤ kC|Q| kF k |R|D−1 k2 kDR(t)−1 k2 ≤
ηχ2 (R, D)ǫ . 1 − ηχ2 (R, D)ǫ
Since D ∈ Dn is arbitrary, we must have k∆AR(t)−1 kF ≤ c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
ηχ2 (R)ǫ . 1 − ηχ2 (R)ǫ
(43)
Numer. Linear Algebra Appl. 2000; 00:1–6
10
X.-W. CHANG
From (34) it follows that |∆An−1 | ≤ ǫC|Qn−1 ||Rn−1 |. Notice that kC|Qn−1 | kF ≤ kC|Q| kF = η. By an argument analogous to that for (43), we can obtain k∆An−1 Rn−1 (t)−1 kF ≤
ηχ2 (Rn−1 )ǫ . 1 − ηχ2 (Rn−1 )ǫ
(44)
From (25) and (43) we obtain ˙ kQ(t)k F ≤
√ 2k∆AR(t)−1 kF ≤
√ 2ηχ2 (R)ǫ . 1 − ηχ2 (R)ǫ
(45)
From (19) and (44), we obtain ˙ kQ(t)T Q(t)k F ≤
√ 2k∆An−1 Rn−1 (t)−1 kF ≤
√ 2ηχ2 (Rn−1 )ǫ . 1 − ηχ2 (Rn−1 )ǫ
(46)
With (45) and (46), from (15) we can conclude that (37) holds. When m = n, (46) becomes √ 2ηχ2 (Rn−1 )ǫ ˙ . kQ(t)kF ≤ 1 − ηχ2 (Rn−1 )ǫ
R1 ˙ Then from k∆QkF ≤ 0 kQ(t)k F dt, we can conclude that (38) holds. Now we prove (40). From (31) we obtain Z
0
1
Z
Z 1√ Z 1 ¯ T Q(τ ˙ )kF dτ k∆AR(t)−1 kF + 2k∆AR(t)−1 kF kQ 0 0 0 √ Z 2ηχ2 (R)ǫ 1 ¯ T ˙ ηχ2 (R)ǫ + kQ Q(τ )kF dτ, ≤ 1 − ηχ2 (R)ǫ 1 − ηχ2 (R)ǫ 0
¯ T Q(t)k ˙ kQ F ≤
1
where in proving the second inequality we used (43). Therefore, ¯ T ∆QkF ≤ kQ
Z
0
1
¯ T Q(t)k ˙ kQ F dt ≤
1
ηχ2 (R)ǫ 1−ηχ2 (R)ǫ √ 2ηχ2 (R)ǫ − 1−ηχ 2 (R)ǫ
=
ηχ2 (R)ǫ √ , 1 − (1 + 2)ηχ2 (R)ǫ
where the denominator of the most right hand side is positive due to the condition (39). Then we can conclude that (40) holds. In the following we make some remarks. Remark 3.5. To estimate χ2 (R) in (37), we can use the following result (see van der Sluis [13]): √ χ2 (R, Dc ) = k|R|Dc−1 k2 kDc R−1 k2 ≤ nχ2 (R), where Dc = diag(kR(:, j)k2 ) ∈ Dn . Given R, we use a norm estimator (see, e.g., [7, Chap. 14]) to estimate χ2 (R, Dc ), which can usually be done in O(n2 ) flops, and then use the estimated value as an approximation to χ2 (R). Similarly we can use this approach to estimate χ2 (Rn−1 ). c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
11
Remark 3.6. From (37) and (40), we obtain the following first-order perturbation bounds: √ kPA ∆QkF . 2ηχ2 (Rn−1 )ǫ, kPA⊥ ∆QkF . ηχ2 (R)ǫ, (47) while the first-order bounds given in [3] are as follows: √ kPA ∆QkF . 2η1 cond2 (Rn−1 )ǫ, kPA⊥ ∆QkF . η2 cond2 (R)ǫ,
(48)
¯T
T
where η1 = k |Q |C|Q| kF and η2 = k |Q |C|Q| kF . Now we look at the relation between cond2 (R) and χ2 (R). On one hand, for any D ∈ Dn , we have √ √ cond2 (R) ≤ k |R|D−1 k2 kD|R−1 | k2 ≤ nk |R|D−1 k2 kDR−1 k2 = nχ2 (R, D). √ Thus cond2 (R) ≤ nχ2 (R). On the other hand, k |R|D−1 k2 kDR−1 k2 ≤ nk |R|D−1 k∞ kD|R−1 | k∞ ,
Thus √ χ2 (R) ≤ inf nk |R|D−1 k∞ kD|R−1 | k∞ = nk |R||R−1 | k∞ ≤ n n cond2 (R), D∈D
where the equality is due to a general result proved by van der Sluis [13]. Notice that η, η1 and η2 can be bounded by functions in terms of m and n. Therefore, there is no essential difference between the first-order bounds given in (47) and the first-order bounds given in (48). Remark 3.7. From (45) we obtain the following rigorous perturbation bound on k∆QkF : √ Z 1 2ηχ2 (R)ǫ ˙ . (49) k∆QkF ≤ kQ(t)k dt ≤ F 1 − ηχ2 (R)ǫ 0 This bound is similar to the following rigorous bound given in [3]: √ √ (50) k∆QkF ≤ (1 + 2 + 3)η cond2 (R)ǫ, p −1 under the condition that k∆AR kF < 3/2 − 1. Here the condition may not be practical as often ∆A is unknown. Since |∆A| ≤ ǫC|A|, k∆ARp−1 kF = k |∆AR−1 | kF ≤ η cond2 (R)ǫ. So a more practical condition would be η cond2 (R)ǫ < 3/2 − 1. This condition is more restrictive than ηχ2 (R)ǫ < 1 required by (49) and it can easily be verified that the bound in (50) is slightly less tighter than the bound in (49), if we ignore the difference between cond2 (R) and χ2 (R). When A is square, both bounds in (49) and (50) can be significantly larger than the bound in (38). Here is an example: 1 105 R= , cond2 (R) ≈ 2 × 105 , χ2 (R) ≤ χ2 (R, Dc ) ≈ 2 × 105 , χ2 (Rn−1,n−1 ) = 1, 0 1 where Dc is defined in Remark 3.5. Remark 3.8. Suppose that Qc Rc is the computed QR factorization of A obtained via Householder transformations. Higham [7, Theorem 18.4]) showed e c, A + ∆A = QR
|∆A| ≤ ǫ C|A|,
ǫ = γm,n u,
(51)
eT Q e = I, γm,n is a moderate constant depending on m and n, u is the unit where Q e + ∆Qc , where roundoff, C ≥ 0 and kCkF = 1. Also the computed Qc satisfies Qc = Q
c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
12
X.-W. CHANG
e with C2 ≥ 0, kC2 kF = 1. Let ∆Q = Q e − Q. Since Qc − Q = ∆Qc + ∆Q |∆Qc | ≤ γm,n uC2 |Q| we have kPA (Qc − Q)kF ≤ n1/2 γm,n u + kPA ∆QkF . (52) In [3] the following example was used to illustrate that the first-older bound on kPA ∆QkF in (48) can severally underestimate the true value of kPA ∆QkF : √ √ √ 2/2 0 1 1 2 2 , 1 , R = A = 0 10−10 , Q = √ 0 0 10−10 1 1 2/2 0 7.0711e−01 7.8508e−07 0 1.0000e+00 , kPA (Qc − Q)kF ≈ 6.2 × 10−13 , Qc = 7.0711e−01 −7.8508e−07 χ2 (Rn−1 )ǫ = cond2 (Rn−1 )ǫ ≈ O(10−16 ),
bound (37) ≈ O(10−11 ).
We see that the rigorous bound on kPA ∆QkF given in (37) should be used when we bound kPA (Qc − Q)kF in (52). 4. CONCLUDING REMARKS We have derived rigorous perturbation bounds on projections of the perturbation of the Qfactor of the QR factorization of full column rank A in the range of A and its orthogonal complement for both normwise perturbation and componentwise perturbation in A. These bounds, unlike the first-order perturbation bounds in the literature, can be used safely. From these bounds, identical or equivalent first-order perturbation bounds in the literature can easily be derived. The results filled gaps left in the literature. When A is square and nonsingular, tighter and simpler rigorous perturbation bounds on the perturbation of the Q-factor were presented. There are still some questions which need to be investigated further. In Theorems 3.1 and 3.2, the conditions (12) and (39) for the bounds on kPA⊥ ∆QkF to hold are slightly stronger than the conditions (8) and (35) for the bounds on kPA ∆QkF to hold, respectively. It would be interesting to investigate if some new bounds on kPA⊥ ∆QkF can be derived under the conditions (8) and (35). This is for a theoretical purpose, not for a practical purpose. For the special case that A is square and nonsingular, our tighter and simpler rigorous bounds on k∆QkF in (11) and (38) were derived separately. It would be more elegant if they could be derived from the general bounds. ACKNOWLEDGEMENT
The author is grateful to Chris Paige for his valuable suggestions.
REFERENCES 1. R. Bhatia, Matrix factorizations and their perturbations, Linear Algebra and Appl., 197, 198 (1994), pp. 245-276. c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6
PERTURBATION OF THE Q-FACTOR OF THE QR FACTORIZATION
13
2. R. Bhatia and K. K. Mukherjea, Variation of the unitary part of a matrix, SIAM J. Matrix Anal. Appl., 15 (1994), pp. 1007–1014. 3. X.-W. Chang and C. C. Paige, Componentwise perturbation analyses for the QR factorization, Numerische Mathematik 88 (2001), pp. 319–345. 4. X.-W. Chang, C. Paige, and G. Stewart, Perturbation analyses for the QR factorization, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 775–791. 5. X.-W. Chang and D. Stehl´ e, Rigorous perturbation bounds of some matrix factorizations, SIAM J. Matrix Anal. Appl., 31 (2010), pp. 2841–2859. 6. X.-W. Chang, D. Stehl´ e, and G. Villard, Perturbation analysis of the QR Factor R in the context of LLL lattice basis reduction, to appear in Mathematics of Computation, 25 pages. 7. N. J. Higham, Accuracy and Stability of Numerical Algorithms, 1st ed., Society for Industrial and Applied Mathematics, Philadelphia, PA, 1996. 8. G. W. Stewart, Perturbation bounds for the QR factorization of a matrix, SIAM J. Numer. Anal., 14 (1977), pp. 509–518. 9. G. W. Stewart, On the perturbation of LU, Cholesky, and QR factorizations, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 1141–1146. 10. J.-G. Sun, Perturbation bounds for the Cholesky and QR factorization, BIT, 31 (1991), pp. 341–352. 11. J.-G. Sun, Componentwise perturbation bounds for some matrix decompositions, BIT, 32 (1992), pp. 702– 714. 12. J.-G. Sun, On perturbation bounds for the QR factorization, Linear Algebra and Appl., 215 (1995), pp. 95–112. 13. A. van der Sluis, Condition numbers and equilibration of matrices, Numerische Mathematik, 14 (1969), pp. 14–23. 14. H. Zha, A componentwise perturbation analysis of the QR decomposition, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 1124–1131.
c 2000 John Wiley & Sons, Ltd. Copyright Prepared using nlaauth.cls
Numer. Linear Algebra Appl. 2000; 00:1–6