
Introduction to Numerical Analysis II:

Krylov Subspace Method

Xiaozhe Hu

The Pennsylvania State University

MATH/CMPSC 456, Feb. 17, 2012


Minimization Problem

Consider $Ax = b$, where $A$ is SPD.

Theorem
$x$ is the solution of $Ax = b$ if and only if it is the minimizer of
$$J(v) = \frac{1}{2}(Av, v) - (b, v),$$
namely
$$x = \arg\min_{v \in \mathbb{R}^n} J(v).$$

Note that for any $v \in \mathbb{R}^n$,
$$\|x - v\|_A^2 = 2J(v) + (b, x).$$
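To see why the theorem holds, note that $J$ is quadratic with a constant SPD Hessian, so its unique stationary point is its minimizer:
$$\nabla J(v) = Av - b, \qquad \nabla^2 J(v) = A \succ 0,$$
so $\nabla J(v) = 0$ exactly when $Av = b$. The identity above then says that minimizing $J$ is the same as minimizing the error in the $A$-norm.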


Richardson Method Again

Richardson method:
$$x_{k+1} = x_k + \omega(b - Ax_k) = x_k + \omega r_k$$

We know: $\omega_{\mathrm{opt}} = \dfrac{2}{\lambda_{\min} + \lambda_{\max}}$ (not feasible in practice!)

Let us use the minimization problem to find the best $\omega$:
$$\min_{\omega} J(x_k + \omega r_k)$$
This gives us
$$\omega_k = \frac{(r_k, r_k)}{(Ar_k, r_k)}$$
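Where does $\omega_k$ come from? Along the line $x_k + \omega r_k$, $J$ is a scalar quadratic in $\omega$, so we set its derivative to zero:
$$\frac{d}{d\omega} J(x_k + \omega r_k) = (A(x_k + \omega r_k) - b,\, r_k) = \omega\,(Ar_k, r_k) - (r_k, r_k) = 0 \quad\Longrightarrow\quad \omega_k = \frac{(r_k, r_k)}{(Ar_k, r_k)}.$$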


Steepest Descent Method

Steepest descent method:

Given $x_0$, $r_0 = b - Ax_0$
For k = 0, 1, ... until convergence
    $\alpha_k = \frac{(r_k, r_k)}{(Ar_k, r_k)}$
    $x_{k+1} = x_k + \alpha_k r_k$
    $r_{k+1} = r_k - \alpha_k Ar_k$
End For

Theorem (Convergence for steepest descent method)
$$\|x - x_k\|_A \le \left(\frac{\lambda_{\max}(A) - \lambda_{\min}(A)}{\lambda_{\max}(A) + \lambda_{\min}(A)}\right)^k \|x - x_0\|_A$$
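Below is a minimal NumPy sketch of the loop above; the function name, tolerance, and iteration cap are illustrative choices, not part of the slides.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-8, max_iter=1000):
    """Steepest descent for SPD A: exact line search along the residual."""
    x = x0.copy()
    r = b - A @ x                    # r_0 = b - A x_0
    for _ in range(max_iter):
        Ar = A @ r                   # one matrix-vector product per step
        alpha = (r @ r) / (Ar @ r)   # alpha_k = (r_k, r_k) / (A r_k, r_k)
        x = x + alpha * r
        r = r - alpha * Ar           # r_{k+1} = r_k - alpha_k A r_k
        if np.linalg.norm(r) < tol:  # "until convergence"
            break
    return x
```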


Conjugate Gradient Method

Idea: Instead of minimizing along one direction, let us minimize over a whole subspace. When the subspace is big enough, we recover the solution $x$, since $x \in \mathbb{R}^n$.

Definition (Krylov subspace)
$$\mathcal{K}_k = \mathrm{span}\{r_0, Ar_0, A^2 r_0, \dots, A^{k-1} r_0\}$$

Minimize over the Krylov subspace:
$$x_k = \arg\min_{v \in x_0 + \mathcal{K}_k} J(v)$$

Question: Can we find a smart way to solve the above minimization problem?


Conjugate Gradient Method

Conjugate gradient method:

Given $x_0$, $r_0 = b - Ax_0$, $p_0 = r_0$
For k = 0, 1, ...
    $\alpha_k = \frac{(r_k, p_k)}{(Ap_k, p_k)}$
    $x_{k+1} = x_k + \alpha_k p_k$
    $r_{k+1} = r_k - \alpha_k Ap_k$
    $\beta_k = -\frac{(Ar_{k+1}, p_k)}{(Ap_k, p_k)}$
    $p_{k+1} = r_{k+1} + \beta_k p_k$
End For


Some Properties

Lemma
$(r_j, p_i) = 0$, $i < j$;
$(Ap_i, p_j) = 0$, $i \ne j$;
$(r_i, r_j) = 0$, $i \ne j$.

Lemma
$$x_k = \arg\min_{v \in x_0 + \mathcal{K}_k} J(v) = \arg\min_{v \in x_0 + \mathcal{K}_k} \|x - v\|_A$$

Lemma
$$\alpha_k = \frac{(r_k, r_k)}{(Ap_k, p_k)}, \qquad \beta_k = \frac{(r_{k+1}, r_{k+1})}{(r_k, r_k)}.$$


Conjugate Gradient Method

Conjugate gradient method (practical form):

Given $x_0$, $r_0 = b - Ax_0$, $p_0 = r_0$
For k = 0, 1, ...
    $\alpha_k = \frac{(r_k, r_k)}{(Ap_k, p_k)}$
    $x_{k+1} = x_k + \alpha_k p_k$
    $r_{k+1} = r_k - \alpha_k Ap_k$
    $\beta_k = \frac{(r_{k+1}, r_{k+1})}{(r_k, r_k)}$
    $p_{k+1} = r_{k+1} + \beta_k p_k$
End For

Only two inner products and one matrix-vector multiplication are needed per iteration.
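A minimal NumPy sketch of the practical form above, reusing the residual inner product so each iteration costs one mat-vec and two inner products; the name and stopping test are illustrative.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, max_iter=1000):
    """Conjugate gradient for SPD A (practical form from the slides)."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = r @ r                        # (r_k, r_k), reused across iterations
    for _ in range(max_iter):
        Ap = A @ p                    # the single mat-vec per iteration
        alpha = rr / (Ap @ p)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r                # serves both beta_k and the stopping test
        if np.sqrt(rr_new) < tol:
            break
        beta = rr_new / rr
        p = r + beta * p
        rr = rr_new
    return x
```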


General A?

Minimize the residual:
$$x_m = \arg\min_{v \in x_0 + \mathcal{K}_m} \|b - Av\|$$

Algorithm
Given $x_0$, $r_0 = b - Ax_0$
For m = 1, 2, ...
    $\mathcal{K}_m = \mathrm{span}\{r_0, Ar_0, \dots, A^{m-1} r_0\}$
    $x_m = \arg\min_{v \in x_0 + \mathcal{K}_m} \|b - Av\|$
End For

How can we solve the minimization problem efficiently?


Arnoldi's Process

Find an orthonormal basis of $\mathcal{K}_m$?

Arnoldi's Process
Choose $v_1 = r_0/\beta$ with $\beta = \|r_0\|$;
For j = 1 : m
    $w_j = Av_j$;
    for i = 1 : j
        $h_{i,j} = (w_j, v_i)$;
        $w_j = w_j - h_{i,j} v_i$;
    end
    $h_{j+1,j} = \|w_j\|$;
    if $h_{j+1,j} = 0$, stop; else $v_{j+1} = w_j/h_{j+1,j}$.
end.
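A NumPy sketch of the process, using the modified Gram-Schmidt loop exactly as written above; the function name and return convention are ours.

```python
import numpy as np

def arnoldi(A, r0, m):
    """Arnoldi process: returns V (n x (m+1)) with orthonormal columns and
    the (m+1) x m upper-Hessenberg H with A V[:, :m] = V H."""
    n = len(r0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)          # v_1 = r_0 / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                 # orthogonalize against v_1..v_j
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0:                   # breakdown: Krylov space is invariant
            return V[:, :j + 1], H[:j + 2, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```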


$$AV_m = [Av_1, Av_2, \dots, Av_m]
= [v_1, v_2, \dots, v_m, v_{m+1}]
\begin{pmatrix}
h_{1,1} & h_{1,2} & h_{1,3} & \cdots & h_{1,m} \\
h_{2,1} & h_{2,2} & h_{2,3} & \cdots & h_{2,m} \\
0 & h_{3,2} & h_{3,3} & \cdots & h_{3,m} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & h_{m,m-1} & h_{m,m} \\
0 & \cdots & 0 & 0 & h_{m+1,m}
\end{pmatrix}
= V_{m+1} H_m$$


Suppose $v \in \mathcal{K}_m = \mathrm{span}\{v_1, v_2, \dots, v_m\}$, i.e. $v = \sum_{i=1}^m y_i v_i = V_m y$ with $y \in \mathbb{R}^m$. Then

$$b - Av = b - AV_m y = b - V_{m+1} H_m y = V_{m+1}(\beta e_1 - H_m y)$$

The minimization problem can be equivalently written as:

$$x_m = \arg\min_v \|b - Av\| = \arg\min_y \|V_{m+1}(\beta e_1 - H_m y)\| = \arg\min_y \|\beta e_1 - H_m y\|,$$

since the columns of $V_{m+1}$ are orthonormal (here $b = \beta v_1$, taking $x_0 = 0$).


Generalized Minimal Residual Method (GMRes)

Generalized minimal residual method:

Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|$, and $v_1 := r_0/\beta$;
Apply Arnoldi's Process with $v_1$ and $A$ to get $H_m$;
Compute the minimizer $y_m$ of $\min_y \|\beta e_1 - H_m y\|$;
Set $x_m = x_0 + V_m y_m$.
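A sketch of GMRes built on the arnoldi() sketch above. For brevity it solves the small least-squares problem with NumPy's lstsq; the Givens-rotation approach on the next slides is the standard way to do this incrementally.

```python
import numpy as np

def gmres(A, b, x0, m):
    """GMRes sketch: minimize ||beta e_1 - H_m y|| and set x_m = x_0 + V_m y_m."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    V, H = arnoldi(A, r0, m)
    k = H.shape[1]                         # may be < m if Arnoldi broke down
    e1 = np.zeros(H.shape[0])
    e1[0] = beta                           # right-hand side beta * e_1
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)
    return x0 + V[:, :k] @ y
```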


Find the minimizer

Transform $H_m$ into an upper triangular matrix:
$$Q_m H_m = \begin{pmatrix} R_m \\ 0_{1 \times m} \end{pmatrix},$$
with $Q_m = \Omega_m \Omega_{m-1} \cdots \Omega_1$, where $R_m \in \mathbb{R}^{m \times m}$ is upper triangular and each $\Omega_i$ is a Givens rotation:

$$\Omega_i = \begin{pmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & c_i & s_i & & \\
& & -s_i & c_i & & \\
& & & & \ddots & \\
& & & & & 1
\end{pmatrix} \in \mathbb{R}^{(m+1) \times (m+1)},$$

where $c_i = \dfrac{h_{i,i}}{\sqrt{h_{i,i}^2 + h_{i+1,i}^2}}$ and $s_i = \dfrac{h_{i+1,i}}{\sqrt{h_{i,i}^2 + h_{i+1,i}^2}}$.


Find the minimizer

$$\min_y \|\beta e_1 - H_m y\|^2 = \min_y \|Q_m(\beta e_1 - H_m y)\|^2
= \min_y \left\| \bar{g} - \begin{pmatrix} R_m \\ 0 \end{pmatrix} y \right\|^2
= |\bar{g}_{m+1}|^2 + \min_y \|g_m - R_m y\|^2$$

where
$$\bar{g} = \beta\, Q_m e_1 = (\bar{g}_1, \bar{g}_2, \dots, \bar{g}_{m+1})^T,$$
and
$$g_m = (\bar{g}_1, \bar{g}_2, \dots, \bar{g}_m)^T.$$
Minimizer $y_m$: $y_m = R_m^{-1} g_m$.
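A sketch of the rotation sweep and the resulting right-hand side, matching the formulas above; givens_reduce and its return convention are our names. The minimizer is then obtained by back substitution, and $|\bar{g}_{m+1}|$ gives the residual norm for free.

```python
import numpy as np

def givens_reduce(H, beta):
    """Apply Givens rotations to the (m+1) x m Hessenberg H, returning the
    m x m triangular R_m and g-bar = beta * Q_m e_1."""
    m = H.shape[1]
    R = H.astype(float).copy()
    g = np.zeros(m + 1)
    g[0] = beta
    for i in range(m):
        denom = np.hypot(R[i, i], R[i + 1, i])
        c, s = R[i, i] / denom, R[i + 1, i] / denom
        for j in range(i, m):              # earlier columns are already zero
            R[i, j], R[i + 1, j] = (c * R[i, j] + s * R[i + 1, j],
                                    -s * R[i, j] + c * R[i + 1, j])
        g[i], g[i + 1] = c * g[i] + s * g[i + 1], -s * g[i] + c * g[i + 1]
    return R[:m, :m], g                    # y_m solves R_m y = g[:m]
```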


Convergence for CG

Theorem (Convergence of CG)
$$\|x - x_k\|_A \le 2\left(\frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1}\right)^k \|x - x_0\|_A,$$
where $\kappa(A)$ is the condition number of $A$.

Theorem (Modified Convergence of CG)
Assume that $\sigma(A) = \sigma_0(A) \cup \sigma_1(A)$ and $l = \#\sigma_0(A)$. Then
$$\|x - x_k\|_A \le 2M\left(\frac{\sqrt{b/a} - 1}{\sqrt{b/a} + 1}\right)^{k-l} \|x - x_0\|_A,$$
where $a = \min \sigma_1(A)$ and $b = \max \sigma_1(A)$, and
$$M = \max_{\lambda \in \sigma_1(A)} \prod_{\mu \in \sigma_0(A)} \left|1 - \frac{\lambda}{\mu}\right|.$$


Preconditioned Conjugate Gradient (PCG)

Idea: Apply CG to $BAx = Bb$ with the inner product $(\cdot, \cdot)_{B^{-1}}$. $B$ has to be SPD.

Preconditioned conjugate gradient method:

Given $x_0$, $r_0 = b - Ax_0$, $p_0 = Br_0$
For k = 0, 1, ...
    $\alpha_k = \frac{(Br_k, r_k)}{(Ap_k, p_k)}$
    $x_{k+1} = x_k + \alpha_k p_k$
    $r_{k+1} = r_k - \alpha_k Ap_k$
    $\beta_k = \frac{(Br_{k+1}, r_{k+1})}{(Br_k, r_k)}$
    $p_{k+1} = Br_{k+1} + \beta_k p_k$
End For
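A NumPy sketch where the preconditioner is passed as a function applying $B$ to a vector, since $B$ is typically never formed explicitly; apply_B and the stopping test are our choices.

```python
import numpy as np

def pcg(A, b, apply_B, x0, tol=1e-8, max_iter=1000):
    """Preconditioned CG; apply_B(r) returns B r for an SPD preconditioner B."""
    x = x0.copy()
    r = b - A @ x
    z = apply_B(r)                    # z_0 = B r_0
    p = z.copy()
    rz = r @ z                        # (B r_k, r_k)
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (Ap @ p)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_B(r)                # one preconditioner application per step
        rz_new = r @ z
        beta = rz_new / rz
        p = z + beta * p
        rz = rz_new
    return x
```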


Sparse Matrix

$$A = \begin{pmatrix}
10 & 0 & 0 & 2 \\
3 & 9 & 0 & 3 \\
0 & 7 & 8 & 0 \\
0 & 4 & 0 & 2
\end{pmatrix}$$

How to store this sparse matrix in a computer?

Only store the non-zeros
Have enough information to do basic operations ($y = Ax$)


COO Format

$$A = \begin{pmatrix}
10 & 0 & 0 & 2 \\
3 & 9 & 0 & 3 \\
0 & 7 & 8 & 0 \\
0 & 4 & 0 & 2
\end{pmatrix}$$

Coordinate Format (COO)

row idx: (1, 1, 2, 2, 2, 3, 3, 4, 4)

col idx: (1, 4, 1, 2, 4, 2, 3, 2, 4)

val: (10, 2, 3, 9, 3, 7, 8, 4, 2)
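COO makes the matrix-vector product a single pass over the stored triples. A 0-based NumPy sketch of the example matrix (Python indices start at 0, the slides' at 1):

```python
import numpy as np

# the 4 x 4 example in COO, with 0-based indices
row_idx = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3])
col_idx = np.array([0, 3, 0, 1, 3, 1, 2, 1, 3])
val     = np.array([10, 2, 3, 9, 3, 7, 8, 4, 2], dtype=float)

def coo_matvec(row_idx, col_idx, val, x, n):
    """y = A x for a COO matrix: accumulate one term per stored nonzero."""
    y = np.zeros(n)
    for r, c, v in zip(row_idx, col_idx, val):
        y[r] += v * x[c]
    return y
```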


CSR/CRS Format

$$A = \begin{pmatrix}
10 & 0 & 0 & 2 \\
3 & 9 & 0 & 3 \\
0 & 7 & 8 & 0 \\
0 & 4 & 0 & 2
\end{pmatrix}$$

Compressed Row Storage Format (CRS/CSR)

row ptr: (1, 3, 6, 8, 10)

col idx: (1, 4, 1, 2, 4, 2, 3, 2, 4)

val: (10, 2, 3, 9, 3, 7, 8, 4, 2)


CSC/CCS Format

$$A = \begin{pmatrix}
10 & 0 & 0 & 2 \\
3 & 9 & 0 & 3 \\
0 & 7 & 8 & 0 \\
0 & 4 & 0 & 2
\end{pmatrix}$$

Compressed Column Storage Format (CCS/CSC)

row idx: (1, 2, 2, 3, 4, 3, 1, 2, 4)

col ptr: (1, 3, 6, 7, 10)

val: (10, 3, 9, 7, 4, 8, 2, 3, 2)

CSC of $A$ is CSR of $A^T$!


Matrix-Vector Multiplication

$$y = Ax, \qquad y_i = \sum_j a_{ij} x_j$$

CSR Format

MatVec in CSR format

for i = 1 : n
    y(i) = 0;
    for j = row_ptr(i) : row_ptr(i+1) - 1
        y(i) = y(i) + val(j) * x(col_idx(j));
    end
end
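The same loop in 0-based NumPy, where row_ptr has n+1 entries and row i's nonzeros occupy val[row_ptr[i]:row_ptr[i+1]]:

```python
import numpy as np

def csr_matvec(row_ptr, col_idx, val, x):
    """y = A x in CSR format (0-based; row_ptr has n+1 entries)."""
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        for j in range(row_ptr[i], row_ptr[i + 1]):   # nonzeros of row i
            y[i] += val[j] * x[col_idx[j]]
    return y
```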


Matrix-Vector Multiplication

CSC Format

MatVec in CSC format

for i = 1 : n
    y(i) = 0;
end
for j = 1 : n
    for i = col_ptr(j) : col_ptr(j+1) - 1
        y(row_idx(i)) = y(row_idx(i)) + val(i) * x(j);
    end
end
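The corresponding 0-based sketch; note that CSC reads x once per column and scatters into y:

```python
import numpy as np

def csc_matvec(col_ptr, row_idx, val, x):
    """y = A x in CSC format (0-based; col_ptr has n+1 entries, A square)."""
    n = len(col_ptr) - 1
    y = np.zeros(n)
    for j in range(n):
        for i in range(col_ptr[j], col_ptr[j + 1]):   # nonzeros of column j
            y[row_idx[i]] += val[i] * x[j]            # scatter into y
    return y
```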


Preconditioners

How to choose the preconditioner $B$?

Principle: (1) $B \approx A^{-1}$, and (2) the action of $B$ is easy to compute.

Examples:
Jacobi
Gauss-Seidel (GS)
Any linear iterative method
Incomplete LU factorization (ILU)
Domain decomposition methods
Multigrid methods
...
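As a concrete instance of the first example, a Jacobi preconditioner takes $B = D^{-1}$ with $D = \mathrm{diag}(A)$; the sketch below builds its action in a form usable as the apply_B argument of the earlier PCG sketch (both names are ours, not the slides').

```python
import numpy as np

def jacobi_preconditioner(A):
    """Jacobi preconditioner B = D^{-1}: returns the action r -> B r."""
    d = np.diag(A).copy()             # diagonal of A (positive when A is SPD)
    return lambda r: r / d

# usage with the earlier sketch:  x = pcg(A, b, jacobi_preconditioner(A), x0)
```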


ILU

Recall LU factorization:

LU factorization
for k = 1 : n
    for i = k + 1 : n
        $l_{ik} = a_{ik}/a_{kk}$
        for j = k + 1 : n
            $a_{ij} = a_{ij} - l_{ik} a_{kj}$
        end for
    end for
end for


ILU
Predefine a nonzero set $S$.

ILU factorization
for k = 1 : n
    for i = k + 1 : n and $(i, k) \in S$
        $l_{ik} = a_{ik}/a_{kk}$
        for j = k + 1 : n and $(i, j) \in S$
            $a_{ij} = a_{ij} - l_{ik} a_{kj}$
        end for
    end for
end for

How to define $S$?
A simple choice: $S$ is the set of nonzero positions of $A$ (this gives ILU(0))
Define $S$ by position: drop entries that are far from the nonzeros of $A$
Define $S$ by size: drop small entries.
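A dense sketch of ILU(0), i.e. the simple choice where $S$ is the nonzero pattern of $A$; a practical version would work on CSR storage, but the update structure is identical to the pseudocode above.

```python
import numpy as np

def ilu0(A):
    """ILU(0): LU elimination restricted to the nonzero pattern S of A.
    Returns one matrix holding L (strict lower part, unit diagonal implied)
    and U (upper part) in place."""
    n = A.shape[0]
    LU = A.astype(float).copy()
    S = (A != 0)                          # the predefined nonzero set S
    for k in range(n - 1):
        for i in range(k + 1, n):
            if not S[i, k]:               # skip entries outside the pattern
                continue
            LU[i, k] /= LU[k, k]          # l_ik = a_ik / a_kk
            for j in range(k + 1, n):
                if S[i, j]:               # update only inside the pattern
                    LU[i, j] -= LU[i, k] * LU[k, j]
    return LU
```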
