A note on GMRES preconditioned by a perturbed LDLT decomposition with static pivoting
M. Arioli, I. S. Duff, S. Gratton, and S. Pralet
Toulouse, Sparse Days 2006 – p.1/30
Outline Multifrontal Static pivoting GMRES and Flexible GMRES Flexible GMRES: a roundoff error analysis GMRES right preconditioned: a roundoff error analysis Test problems Numerical experiments
Toulouse, Sparse Days 2006 – p.2/30
Linear system We wish to solve large sparse systems
Ax = b where A ∈ IRN×N is symmetric indefinite
Toulouse, Sparse Days 2006 – p.3/30
Linear system A particular and important case arises in saddle-point problems where the coefficient matrix is of the form
H
A
AT 0
Since we want accurate solutions, we would prefer to use a direct method of solution and our method of choice uses a multifrontal approach.
Toulouse, Sparse Days 2006 – p.4/30
Multifrontal method ASSEMBLY TREE
Toulouse, Sparse Days 2006 – p.5/30
Multifrontal method ASSEMBLY TREE
AT EACH NODE
F
11
T
F 12
F 12
F 22
Toulouse, Sparse Days 2006 – p.5/30
Multifrontal method ASSEMBLY TREE
AT EACH NODE
F
11
T
F 12
F 12
F 22
T −1 F22 ← F22 − F12 F11 F12
Toulouse, Sparse Days 2006 – p.5/30
Multifrontal method
From children to parent
Toulouse, Sparse Days 2006 – p.6/30
Multifrontal method
From children to parent Gather/Scatter operations (indirect addressing)
ASSEMBLY
Toulouse, Sparse Days 2006 – p.6/30
Multifrontal method
From children to parent Gather/Scatter operations (indirect addressing)
ASSEMBLY
Full Gaussian elimination, Level 3 BLAS (TRSM, GEMM)
ELIMINATION
Toulouse, Sparse Days 2006 – p.6/30
Multifrontal method
From children to parent Gather/Scatter operations (indirect addressing)
ASSEMBLY
Full Gaussian elimination, Level 3 BLAS (TRSM, GEMM)
ELIMINATION
Toulouse, Sparse Days 2006 – p.6/30
Multifrontal method
F
11
T
F 12
F 12
F 22
Pivot can only be chosen from F11 block since F22 is NOT fully summed.
Toulouse, Sparse Days 2006 – p.7/30
F 12
F 22
11
F
0
T
F 12
Multifrontal method
0
Situation wrt rest of matrix
Toulouse, Sparse Days 2006 – p.8/30
Pivoting (1 × 1) x
y
Choose x as 1 × 1 pivot if |x| > u|y| where |y| is the largest in column.
Toulouse, Sparse Days 2006 – p.9/30
Pivoting (2 × 2) x1 x2 x2 x3
y z
For the indefinite case, we can choose 2 × 2 pivot where we require " " # " # # −1 1 x1 x2 |y| u ≤ 1 x x |z| 2 3 u where again |y| and |z| are the largest in their columns.
Toulouse, Sparse Days 2006 – p.10/30
y
Pivoting
k
xk
If we assume that k − 1 pivots are chosen but |xk | < u|y| :
Toulouse, Sparse Days 2006 – p.11/30
y
Pivoting
k
xk
If we assume that k − 1 pivots are chosen but |xk | < u|y| : we can either take the RISK and use it or
Toulouse, Sparse Days 2006 – p.11/30
DELAY the pivot and
xk
y
Pivoting
k
If we assume that k − 1 pivots are chosen but |xk | < u|y| : we can either take the RISK and use it or then send to the parent a larger Schur complement.
Toulouse, Sparse Days 2006 – p.11/30
DELAY the pivot and
xk
y
Pivoting
k
If we assume that k − 1 pivots are chosen but |xk | < u|y| : we can either take the RISK and use it or then send to the parent a larger Schur complement.
This can cause more work and storage
Toulouse, Sparse Days 2006 – p.11/30
Static Pivoting An ALTERNATIVE is to use Static Pivoting, by replacing xk by xk + τ
and CONTINUE.
Toulouse, Sparse Days 2006 – p.12/30
Static Pivoting An ALTERNATIVE is to use Static Pivoting, by replacing xk by xk + τ
and CONTINUE. This is even more important in the case of parallel implementation where static data structures are often preferred
Toulouse, Sparse Days 2006 – p.12/30
Static Pivoting
Several codes use (or have an option for) this device: SuperLU (Demmel and Li) PARDISO (Gärtner and Schenk) MA57 (Duff and Pralet)
Toulouse, Sparse Days 2006 – p.13/30
Static Pivoting We thus have factorized A + E = LDLT = M
where |E| ≤ τ I
Toulouse, Sparse Days 2006 – p.14/30
Static Pivoting We thus have factorized A + E = LDLT = M
where |E| ≤ τ I The three codes then have an Iterative Refinement option. IR will converge if ρ(M −1 E) < 1
Toulouse, Sparse Days 2006 – p.14/30
Static Pivoting Choosing τ
Toulouse, Sparse Days 2006 – p.15/30
Static Pivoting Choosing τ Increase τ =⇒ increase stability of decomposition
Toulouse, Sparse Days 2006 – p.15/30
Static Pivoting Choosing τ Increase τ =⇒ increase stability of decomposition Decrease τ =⇒ better approximation of the original matrix, reduces ||E||
Toulouse, Sparse Days 2006 – p.15/30
Static Pivoting Choosing τ Increase τ =⇒ increase stability of decomposition Decrease τ =⇒ better approximation of the original matrix, reduces ||E|| Trade-off
≈ ε =⇒ big growth in preconditioning matrix M ≈ 1 =⇒ huge error ||E||.
Toulouse, Sparse Days 2006 – p.15/30
Static Pivoting Choosing τ Increase τ =⇒ increase stability of decomposition Decrease τ =⇒ better approximation of the original matrix, reduces ||E|| Trade-off
≈ ε =⇒ big growth in preconditioning matrix M ≈ 1 =⇒ huge error ||E||.
Conventional wisdom is to choose √
τ = O( ε)
Toulouse, Sparse Days 2006 – p.15/30
Static Pivoting Choosing τ Increase τ =⇒ increase stability of decomposition Decrease τ =⇒ better approximation of the original matrix, reduces ||E|| Trade-off
≈ ε =⇒ big growth in preconditioning matrix M ≈ 1 =⇒ huge error ||E||.
Conventional wisdom is to choose √
In real life ρ(M −1 E) > 1
τ = O( ε)
Toulouse, Sparse Days 2006 – p.15/30
Right preconditioned GMRES and Flexible GMRES procedure [x] = right_Prec_GMRES(A,M,b) x0 = M −1 b, r0 = b − Ax0 and β = ||r0 || v1 = r0 /β; k = 0; while ||rk || > µ(||b|| + ||A|| ||xk ||) k = k + 1; zk = M −1 vk ; w = Azk ; for i = 1, . . . , k do hi,k = viT w ; w = w − hi,k vi ; end for; hk+1,k = ||w||;
vk+1 = w/hk+1,k ; Vk = [v1 , . . . , vk ]; Hk = {hi,j }1≤i≤j+1;1≤j≤k ; yk = arg miny ||βe1 − Hk y||; xk = x0 + M −1 Vk yk and rk = b − Axk ; end while ; end procedure.
procedure [x] =FGMRES(A,M,b) x0 = M −1 b, r0 = b − Ax0 and β = ||r0 || v1 = r0 /β; k = 0; while ||rk || > µ(||b|| + ||A|| ||xk ||) k = k + 1; zk = M −1 vk ; w = Azk ; for i = 1, . . . , k do hi,k = viT w ; w = w − hi,k vi ; end for; hk+1,k = ||w||;
vk+1 = w/hk+1,k ; Zk = [z1 , . . . , zk ]; Vk = [v1 , . . . , vk ]; Hk = {hi,j }1≤i≤j+1;1≤j≤k ; yk = arg miny ||βe1 − Hk y||; xk = x0 + Zk yk and rk = b − Axk ; end while ; end procedure.
Toulouse, Sparse Days 2006 – p.16/30
Roundoff error 1 ˆ and D ˆ in floating-point arithmetic satisfy The computed L A + δA + τ E = M ˆ |D| ˆ |L ˆ T | || ||δA|| ≤ c(n)ε|| |L| ||E|| ≤ 1.
The perturbation δA must have a norm smaller than τ , in order to not dominate the global error.
Toulouse, Sparse Days 2006 – p.17/30
Roundoff error 1 ˆ and D ˆ in floating-point arithmetic satisfy The computed L A + δA + τ E = M ˆ |D| ˆ |L ˆ T | || ||δA|| ≤ c(n)ε|| |L| ||E|| ≤ 1.
The perturbation δA must have a norm smaller than τ , in order to not dominate the global error. A sufficient condition for this is
ˆ |D| ˆ |L ˆ T | || ≤ τ n ε|| |L|
Toulouse, Sparse Days 2006 – p.17/30
Roundoff error 1 ˆ and D ˆ in floating-point arithmetic satisfy The computed L A + δA + τ E = M ˆ |D| ˆ |L ˆ T | || ||δA|| ≤ c(n)ε|| |L| ||E|| ≤ 1.
The perturbation δA must have a norm smaller than τ , in order to not dominate the global error. A sufficient condition for this is ˆ |D| ˆ |L ˆ T | || ≈ || |L|
n τ
=⇒ ε ≤
ˆ |D| ˆ |L ˆ T | || ≤ τ n ε|| |L|
τ2 n2
Toulouse, Sparse Days 2006 – p.17/30
Roundoff error 1 ˆ and D ˆ in floating-point arithmetic satisfy The computed L A + δA + τ E = M ˆ |D| ˆ |L ˆ T | || ||δA|| ≤ c(n)ε|| |L| ||E|| ≤ 1.
The perturbation δA must have a norm smaller than τ , in order to not dominate the global error. A sufficient condition for this is ˆ |D| ˆ |L ˆ T | || ≈ || |L|
n τ
=⇒ ε ≤
Moreover, we assume that
ˆ |D| ˆ |L ˆ T | || ≤ τ n ε|| |L|
τ2 n2
max{||M −1 ||, ||Z¯k ||} ≤
c˜ τ
.
Toulouse, Sparse Days 2006 – p.17/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages:
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages: 1. Error analysis of the Arnoldi-Krylov process (Giraud and Langou, Björck and Paige, and Paige, Rozložník, and Strakoš).
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages: 1. Error analysis of the Arnoldi-Krylov process (Giraud and Langou, Björck and Paige, and Paige, Rozložník, and Strakoš). 2. Error analysis of the Givens process used on the upper Hessenberg matrix Hk in order to reduce it to upper triangular form.
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages: 1. Error analysis of the Arnoldi-Krylov process (Giraud and Langou, Björck and Paige, and Paige, Rozložník, and Strakoš). 2. Error analysis of the Givens process used on the upper Hessenberg matrix Hk in order to reduce it to upper triangular form. 3. Error analysis of the computation of xk in FGMRES and GMRES.
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages: 1. Error analysis of the Arnoldi-Krylov process (Giraud and Langou, Björck and Paige, and Paige, Rozložník, and Strakoš). 2. Error analysis of the Givens process used on the upper Hessenberg matrix Hk in order to reduce it to upper triangular form. 3. Error analysis of the computation of xk in FGMRES and GMRES. 4. Use of the static pivoting properties and of A + E = LDLT in order to have the final expressions.
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error 2 The roundoff error analysis of both FGMRES and GMRES can be made in four stages: 1. Error analysis of the Arnoldi-Krylov process (Giraud and Langou, Björck and Paige, and Paige, Rozložník, and Strakoš). 2. Error analysis of the Givens process used on the upper Hessenberg matrix Hk in order to reduce it to upper triangular form. 3. Error analysis of the computation of xk in FGMRES and GMRES. 4. Use of the static pivoting properties and of A + E = LDLT in order to have the final expressions. The first two stages of the roundoff error analysis are the same for both FGMRES and GMRES. the last two stages are specific to each one of the two algorithms.
Toulouse, Sparse Days 2006 – p.18/30
Roundoff error FGMRES We can prove that Flexible GMRES computes a x ¯ k s.t. ||b−A¯ xk || ≤ c(n, k)ε(||b||+||A|| ||¯ x0 ||+||A|| ||Z¯k || ||M (¯ xk −¯ x0 )||)+O(ε2 )
Toulouse, Sparse Days 2006 – p.19/30
Roundoff error FGMRES We can prove that Flexible GMRES computes a x ¯ k s.t. ||b−A¯ xk || ≤ c(n, k)ε(||b||+||A|| ||¯ x0 ||+||A|| ||Z¯k || ||M (¯ xk −¯ x0 )||)+O(ε2 )
If c(n, k)ε||A|| ||Z¯k || < 1
∀k
Toulouse, Sparse Days 2006 – p.19/30
Roundoff error FGMRES We can prove that Flexible GMRES computes a x ¯ k s.t. ||b−A¯ xk || ≤ c(n, k)ε(||b||+||A|| ||¯ x0 ||+||A|| ||Z¯k || ||M (¯ xk −¯ x0 )||)+O(ε2 )
If c(n, k)ε||A|| ||Z¯k || < 1
∀k
||b − A¯ xk || ≤ 2µε(||b|| + ||A|| (||¯ x0 || + ||¯ xk ||)) + O(ε2 ). µ=
c(n,k) ¯k || 1−c(n,k)ε||A|| ||Z
Toulouse, Sparse Days 2006 – p.19/30
Roundoff error right preconditioned GMRES As we did for FGMRES, we can prove that right preconditioned GMRES ¯k s.t. computes a x n ||b − A¯ xk || ≤ c1 (n, k)χε ||b|| + ||A|| ||x0 || + ||AM −1 || ||M || ||¯ xk − x0 || + i h ˆ |D| ˆ |L ˆ T | || × ||A|| ||Z¯k || + ||AM −1 || ||M −1 || || |L| io h xk − x0 || + ||x0 ||) + O(ε2 ). ||M (¯ xk − x0 )|| + nε ||M || (||¯
Toulouse, Sparse Days 2006 – p.20/30
Roundoff error right preconditioned GMRES As we did for FGMRES, we can prove that right preconditioned GMRES ¯k s.t. computes a x n ||b − A¯ xk || ≤ c1 (n, k)χε ||b|| + ||A|| ||x0 || + ||AM −1 || ||M || ||¯ xk − x0 || + i h ˆ |D| ˆ |L ˆ T | || × ||A|| ||Z¯k || + ||AM −1 || ||M −1 || || |L| io h xk − x0 || + ||x0 ||) + O(ε2 ). ||M (¯ xk − x0 )|| + nε ||M || (||¯ h
ˆ |D| ˆ |L ˆ T | || If ζ = c2 (n, k) ε ||A|| ||Z¯k || + ||M −1 || || |L|
i