NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. (2010) Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nla.702
Condition numbers and perturbation analysis for the Tikhonov regularization of discrete ill-posed problems Delin Chu1 , Lijing Lin2, ‡ , Roger C. E. Tan1 and Yimin Wei3, 4, ∗, † 1 Department
of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543, Singapore of Mathematical Science, Fudan University, Shanghai, 200433, People’s Republic of China 3 School of Mathematical Sciences, Fudan University, Shanghai 200433, People’s Republic of China 4 Shanghai Key Laboratory of Contemporary Applied Mathematics, Fudan University, Shanghai 200433, People’s Republic of China 2 Institute
SUMMARY One of the most successful methods for solving the least-squares problem minx Ax −b2 with a highly ill-conditioned or rank deficient coefficient matrix A is the method of Tikhonov regularization. In this paper, we derive the normwise, mixed and componentwise condition numbers and componentwise perturbation bounds for the Tikhonov regularization. Our results are sharper than the known results. Some numerical examples are given to illustrate our results. Copyright q 2010 John Wiley & Sons, Ltd. Received 9 December 2008; Revised 8 December 2009; Accepted 13 December 2009 KEY WORDS:
linear least squares; condition number; Tikhonov regularization; perturbation
1. INTRODUCTION The well-known Tikhonov regularization technique [1] in numerical analysis has been investigated extensively. Some recent developments can be found in [2–12]. The key idea in Tikhonov regularization is to incorporate a priori assumptions about the size and smoothness of the desired solution, in the form of the smoothing function ( f ) for the continuous case, and the (semi) norm Lx2 ∗ Correspondence
to: Yimin Wei, School of Mathematical Sciences, Fudan University, Shanghai 200433, People’s Republic of China. † E-mail:
[email protected], ymwei
[email protected],
[email protected] ‡ Current address: School of Mathematics, The University of Manchester, Manchester M13 9PL, U.K. Contract/grant sponsor: NUS Research; contract/grant number: R-146-000-087-112 Contract/grant sponsor: National Natural Science Foundation of China; contract/grant number: 10871051 Contract/grant sponsor: Shanghai Education Committee (Dawn Project), Shanghai Science and Technology Committee; contract/grant numbers: 09DZ2272900, KLMM0901 Contract/grant sponsor: Doctoral Program of the Ministry of Education; contract/grant number: 20090071110003 Copyright q
2010 John Wiley & Sons, Ltd.
D. CHU ET AL.
for the discrete case. For discrete ill-posed problems in mechanical systems and signal processing, the Tikhonov regularization in general form leads to the regularized minimization problem min {Ax −b22 +2 Lx22 }, x
A ∈ Rm×n ,
L ∈ R p×n ,
(1)
where the regularization parameter controls the weight given to the minimization of Lx2 relative to the minimization of the residual Ax −b2 . The matrix L is typically the identity matrix In or a discrete approximation to some derivation operator. As usual, for the regularization problem (1), it is always assumed that rank(L) = pnm and A rank =n L (cf. [13, Section 5]). Thus, the problem (1) has a unique solution for any >0. The regularization problem (1) can also be written as: A b min (2) x− . x L 0 2
Since the normal equation of (2) is (AT A +2 L T L)x = AT b,
(3)
we obtain the explicit expression for the regularized solution immediately, x = (AT A +2 L T L)−1 AT b. Clearly, the perturbed counterpart of the problem (1) and its normal equation (3) are, respectively, min {(A +A)(x +x)−(b +b)22 +2 L(x +x)22 },
(4)
((A +A)T (A +A)+2 L T L)(x +x) = (A +A)T (b +b),
(5)
x+ x
and where A and b are perturbations to A and b, respectively, and A +A rank = n. L For the Tikhonov regularization, Malyshev [14] studied the condition numbers x2 A F x2 b2 F A→x = lim sup , b→x = lim sup , →0 A F x 2 →0 b2 x 2 A2 b2 and proved that FA→x
⎡ rr T I − r 2 ⎢ r 22 I ⎣ = A2 (AT A +I )−1 −AT x 2 0
b→x = Copyright q
(AT A +I )−1 AT x 2
2010 John Wiley & Sons, Ltd.
2 b2
,
⎤ r x T ⎥ r 2 x 2 ⎦ , I
(6)
2
(7) Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
CONDITION NUMBERS OF TIKHONOV REGULARIZATION
where r = b − Ax . However, relative normwise condition numbers in (6) and (7) do not take into account the structures of A and b with respect to the scaling and/or sparsity. When A and/or b are badly scaled or contain many zeros, the size of a perturbation in terms of its norm cannot reflect the relative size of the perturbation on its small (or zero) entries. Motivated by this observation, a different approach known as componentwise analysis [15, 16] in the perturbation theory arises. In componentwise analysis, two different condition numbers are considered: the first one measures the errors in the output using norms and the input perturbations componentwise, and the second one measures both the errors in the output and the perturbations in the input componentwise. The resulting condition numbers are called mixed and componentwise by Gohberg and Koltracht [17]. We will use this terminology throughout this paper. In [18] Skeel performed a mixed perturbation analysis for the nonsingular linear system of equation and a mixed error analysis for Gaussian elimination. Rohn [19] developed the componentwise condition numbers for matrix inversion and nonsingular linear system. Gohberg and Koltracht [17] presented explicit expressions for both mixed and componentwise condition numbers, for differentiable functions as well as the linear systems. The componentwise perturbation analysis and the mixed and componentwise condition numbers for the linear least squares problems can be found in [15, 20–24]. To the best of our knowledge, there is no mixed and componentwise perturbation analysis available for the Tikhonov regularization. Our paper is to fill in this gap. We will derive the normwise, mixed and componentwise condition numbers and componentwise perturbation bounds for the Tikhonov regularization. Our normwise error bound is sharper than the known results. The present paper is organized as follows. In Section 2, we derive computable formulae for the normwise, mixed and componentwise condition number for the Tikhonov regularization. Condition numbers for the residue r = b − Ax are also considered. We also give sharp bounds for unrestricted (i.e. not necessarily small) perturbations. We study the augmented system for the Tikhonov regularization in Section 3, which extends the results in [20, 23]. Finally, we compare our results with the known results by numerical examples. Throughout this paper, • Let A ∈ Rm×n and B ∈ R p×q , the Kronecker product A ⊗ B ∈ Rmp×nq is defined by: ⎡
a11 B
⎢ ⎢ a21 B ⎢ A⊗ B =⎢ . ⎢ . ⎣ . am1 B
a12 B
...
a22 B
...
.. .
..
am2 B
.
a1n B
⎤
⎥ a2n B ⎥ ⎥ ; .. ⎥ ⎥ . ⎦
. . . amn B
• For any matrix A = [a1 a2 . . . an ] ∈ Rm×n , vec(A) is defined by vec(A) = [a1T a2T . . . anT ]T ; • For any matrix A = (aij ) ∈ Rm×n , |A| is defined by their absolute value |A| = (|aij |) ∈ Rm×n ; A† ∈ Rn×m is the Moore–Penrose inverse of A [25, 26]; • A 2 denotes the spectral norm of A; A F represents the Frobenius norm of A and A F = trace(AT A); A∞ is the infinity norm of A; • [A b] is the augmented matrix of A and b; (A) denotes the spectral radius of A; Copyright q
2010 John Wiley & Sons, Ltd.
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
D. CHU ET AL.
•
=
where (n) ej
m n
E ij (m ×n)⊗ E j i (n ×m), i=1 j =1 (m) (n) E ij (m ×n) = ei (e j )T ∈ Rm×n denotes the (i, j )th elementary
(m)
matrix, and ei
and
are the ith column and j th column of identity matrices Im ∈ Rm×m and In ∈ Rn×n , respectively. The matrix is called the vec-permutation matrix.
With the notation above, the following results hold: vec(A X B) = (B T ⊗ A)vec(X ),
|A ⊗ B| = |A|⊗|B|, vec(A) = vec(AT ),
(y T ⊗Y ) = Y ⊗ y T .
2. NORMWISE, MIXED AND COMPONENTWISE CONDITION NUMBERS FOR THE TIKHONOV REGULARIZATION In this section, first we derive normwise perturbation bounds and the condition numbers for the solutions of Tikhonov regularization and proceed to the mixed and componentwise perturbation analysis. In many applications the error in b can be larger than the error in A. Moreover, A and b do not have to be of the same magnitude. To get a more general analysis and flexible result, which allows different scalings of A and b, we take advantage of the weighted Frobenius norm of the data A and b defined by [27] [AT, b] F , where T is a positive definite diagonal matrix and >0. Definition 1 Let x and x +x be the unique solutions of the problems (1) and (4), respectively. Then the normwise, mixed and componentwise condition numbers of Tikhonov regularization are defined by F condReg := lim
sup
→0 [( A)T ,(b)] F [AT ,b] F
m Reg := lim sup
→0 | A|| A| |b||b|
cReg := lim sup
→0 | A|| A| |b||b|
x2 , x 2
x∞ , x ∞ 1 x , x ∞
where b/a is an entry-wise division, that is, b/a := (bi /ai ), or b./a in MATLAB notation, and /0 is interpreted as zero if = 0 and ∞ otherwise. Copyright q
2010 John Wiley & Sons, Ltd.
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
CONDITION NUMBERS OF TIKHONOV REGULARIZATION
In the above definition, we study the general case when both the coefficient matrix A and the right-hand side b are perturbed, which is an extension to (6) and (7) by Malyshev [14], who only considered the perturbation on A or b. Theorem 2 † Let DT = T ⊗ I and Dx be the Moore–Penrose inverse of Dx = diag(x ). Denote r = b − Ax ,
M = (AT A +L T L)−1 AT ,
H = −x T ⊗((AT A +L T L)−1 AT )+(AT A +L T L)−1 ⊗r T . (i) Assume that [(A)T (b)] F [AT b] F . If is small enough, then x2 [H DT−1 −1 M]2 [(A)T (b)] F , x 2 x 2
(8)
where the symbol ‘’ means that the expression on the left-hand side is ‘less than or similar to’ the right-hand side. (ii) [H DT−1 −1 M]2 [AT b] F , x 2 |H |vec(|A|)+|M| |b|∞ m Reg = , x ∞
F condReg =
cReg = |Dx† ||H |vec(|A|)+|Dx† | |M| |b|∞ .
(9) (10) (11)
Proof We shall use (3) and (5) that x = (AT A +2 L T L)−1 (AT b +(A)T (b − Ax )− AT Ax )+O(2 )
(12)
provided that [(A)T (b)] F O(). In the following we will show that (8) and (9) hold. By omitting the term O(2 ) in (12) we have x = (AT A +2 L T L)−1 (AT b +(A)T (b − Ax )− AT Ax ),
(13)
which leads to x = [−x T ⊗((AT A +L T L)−1 AT )+(AT A +L T L)−1 ⊗r T (AT A +L T L)−1 AT ] vec(A) × b DT vec(A) −1 −1 = [H DT M] . (b) Copyright q
2010 John Wiley & Sons, Ltd.
(14)
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
D. CHU ET AL.
Thus, a simple calculation yields x2 [H DT−1 −1 M]2 DT vec(A) x 2 x 2 (b) [H DT−1
−1
M x 2
=
]2
2
[(A)T (b)] F
[H DT−1 M−1 ]2 [AT b] F . x 2
Hence, Part (i) is true. Furthermore, the upper bound above is attainable according to the property of 2-norm. Consequently, (9) holds. Next, we will prove (10). According to | A||A|, we know that if aij = 0, then aij = 0, i.e. the zero elements of A are not permitted to be perturbed. Therefore, †
vec(A) = D A D A vec(A),
(15)
†
where D A = diag(vec(A)) and D A is the Moore–Penrose inverse of D A . We can rewrite (14) as ⎤ ⎡ † D A vec(A) ⎦, x = [H D A M Db ] ⎣ † Db (b)
(16)
where Db = diag(b). Taking the infinity norm and using the assumption in the definition of m Reg , we have ⎡ ⎤ † D A vec(A) ⎦ ⎣ x∞ [H D A M Db ]∞ (17) † Db (b) ∞
[H D A M Db ]∞ .
(18)
The equality is attainable. According to the definition of ·∞ , there exists y0 with y0 ∞ = 1 such that [H D A M Db ]∞ = max [H D A M Db ]y∞ = [H D A M Db ]y0 ∞ . y∞ =1
Let
vec(A) b
=
DA Db
y0 .
Then (18) becomes an equality. This results in m Reg = Copyright q
2010 John Wiley & Sons, Ltd.
[H D A M Db ]∞ . x ∞
(19) Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
CONDITION NUMBERS OF TIKHONOV REGULARIZATION
It is easy to verify that [H D A M Db ]∞ = [|H D A | |M Db |]e∞ = |H D A |e +|M Db |e∞ = |H |vec(|A|)+|M| |b|∞ , e = [1, 1, . . ., 1]T .
where Finally, we prove (11). From the definition of cReg , if (x )i = 0 but x i = 0, then cReg = ∞, otherwise, reformulate (14) as ⎡ ⎤ † D vec(A) A ⎦. Dx† x = [Dx† H D A Dx† M Db ] ⎣ (20) † Db (b) Then we obtain cReg = [Dx† H D A Dx† M Db ]∞ = [|Dx† H D A | |Dx† M Db |]e∞ = [|Dx† H D A |e +|Dx† M Db |e]∞ = |Dx† ||H |vec(|A|)+|Dx† ||M||b|∞ .
Condition numbers bound the worst-case sensitivity of solution of the problem to small perturbations in the input data [16, 28]. If is the size of the perturbation, then a term O(2 ) is neglected and the bound only holds for small enough. In this sense, condition numbers are first-order bounds for these sensitivities. The following result exhibits unrestricted perturbation bounds for the Tikhonov regularization. Theorem 3 Let A ∈ Rm×n , b ∈ Rm be the perturbations to A and b, respectively. Let x and y := x + x be the solutions of normal equations (3) and (5), respectively. If for some non-negative E ∈ Rm×n and f ∈ Rm we have | A|E and |b| f , then y − x ∞ |Q|vec(E)+|S| f ∞ x ∞ x ∞
(21)
where Q = −yT ⊗((AT A +2 L T L)−1 AT )+(AT A +2 L T L)−1 ⊗s T , S = (AT A +2 L T L)−1 AT ,
s = b +b −(A +A)y .
Proof Substituting (3) into (5), we have (AT A +2 L T L)(x − y ) = −(A)T ((b +b)−(A +A)y )− AT (b −Ay ) = −(A)T s − AT (b)+ AT (A)y . Copyright q
2010 John Wiley & Sons, Ltd.
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
D. CHU ET AL.
Hence, y − x = (AT A +2 L T L)−1 (−AT Ay +(A)T s + AT b) = [−yT ⊗((AT A +2 L T L)−1 AT )+(AT A +2 L T L)−1 ⊗s T (AT A +2 L T L)−1 AT ] ×
vec(A) b
.
Let D E = diag(vec(E)), D f = diag( f ). We have y − x = [Q S] which gives
DE
DE y − x ∞ [Q S]
⎤ ⎡ † D E vec(A) ⎦, ⎣ † Df D f (b)
(22)
= |Q|vec(E)+|S| f ∞ . Df ∞
Thus, the inequality (21) follows.
By the same argument as in the proof of Theorem 2, the error bound (21) is attainable. In this sense, the inequality (21) is optimal. In what follows we assume that b ∈ / R(A), where R(A) denotes the range of A. That is, the residual vector r = b − Ax = 0, where x is the solution of the problem (1). Next we introduce the mixed and componentwise condition numbers for the residual vector r . Definition 4 The mixed and componentwise condition numbers for the residual vector are, respectively, given by: m res := lim sup →0
| A|| A| |b||b|
cres := lim sup →0
| A|| A| |b||b|
r ∞ , r ∞ 1 r . r ∞
The following theorem presents the explicit expressions of the mixed and componentwise condition numbers m res and cres . Theorem 5 m res =
|U |vec(|A|)+|V | |b|∞ , x ∞
(23)
cres = |Dr† ||U |vec(|A|)+|Dr† ||V ||b|∞ , Copyright q
2010 John Wiley & Sons, Ltd.
(24)
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
CONDITION NUMBERS OF TIKHONOV REGULARIZATION
where r = b − Ax , U = −x T ⊗(I − A(AT A +2 L T L)−1 AT )−(A(AT A +2 L T L)−1 )⊗r T , V = I − A(AT A +2 L T L)−1 AT , †
and Dr is the Moore–Penrose inverse of Dr = diag(r ). Proof Let s = b +b −(A +A)y where y is the solution of the problem (5). Omitting the higher-order terms, the perturbation of the residual vector r satisfies r := s −r
(25)
= b +b −(A +A)y −(b − Ax ) = −(I − A(AT A +2 L T L)−1 AT )(A)x − A(AT A +2 L T L)−1 (A)Tr +(I − A(AT A +2 L T L)−1 AT )b = [−x T ⊗(I − A(AT A +2 L T L)−1 AT )−(A(AT A +2 L T L)−1 )⊗r T , vec(A) I − A(AT A +2 L T L)−1 AT ] b ⎤ ⎡ † vec(A) D A vec(A) ⎦. = [U V ] = [U D A V Db ] ⎣ † b Db (b)
(26)
Consequently, we obtain
r ∞ [U D A
⎡ ⎤ † D A vec(A) ⎦ ⎣ V Db ]∞ † Db (b) ∞
[U D A V Db ]∞ .
(27)
Again, as shown in Theorem 2, the error bound (27) is attainable. Therefore, we have m res =
[U D A V Db ]∞ |U |vec(|A|)+|V | |b|∞ = . r ∞ r ∞
As for the componentwise condition number cres , if ri = 0 but ri = 0, then cres = ∞. Otherwise, we have cres = [Dr† U D A Copyright q
Dr† V Db ]∞ = |Dr† ||U |vec(|A|)+|Dr† ||V ||b|∞ .
2010 John Wiley & Sons, Ltd.
Numer. Linear Algebra Appl. (2010) DOI: 10.1002/nla
D. CHU ET AL.
3. PERTURBATION ANALYSIS WITH AUGMENTED SYSTEM FOR THE TIKHONOV REGULARIZATION In this section we derive perturbation results corresponding to the componentwise errors in A and b for the Tikhonov regularization with augmented system. A similar analysis on linear least squares problem was given by Arioli et al. [23] and Bj¨orck [20]. Recalling that for the linear system of equation Ax = b with A nonsingular, the basic identity for the perturbation analysis given in Bauer [29] and Skeel [18] is x = (I + A−1 A)−1 A−1 (b −Ax),
(28)
which implies |x||(I + A−1 A)−1 | |A−1 |(|b|+| A| |x|). Hence we have, |x| (I −|A−1 || A|)−1 |A−1 |(|b|+| A| |x|) = (I +O(|A−1 || A|))|A−1 |(|b|+| A||x|), (|A−1 || A|)