
Advanced Topics from Scientific Computing
TU Berlin Winter 2022/23
Notebook 08
Creative Commons License · Jürgen Fuhrmann

html"""
<b> Advanced Topics from Scientific Computing<br>
TU Berlin Winter 2022/23<br>
Notebook 08<br>
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons Lizenzvertrag" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /></a>
Jürgen Fuhrmann
</b> <br>
"""
3.5 ms
begin
using Graphs
using ExtendableSparse
using GraphPlot
using Colors
using LinearAlgebra
using SparseArrays
using IterativeSolvers
end
4.8 s

Iterative methods for linear systems

md"""
# Iterative methods for linear systems
"""

115 ms

Let $V=\mathbb R^n$ be equipped with the inner product $(\cdot,\cdot)$. Let $A$ be an $n\times n$ nonsingular matrix.

Solve $Au=b$ iteratively. For this purpose, two components are needed:

  • Preconditioner: a matrix $M\approx A$ "approximating" the matrix $A$ but with the property that the system $Mv=f$ is easy to solve

  • Iteration scheme: algorithmic sequence using $M$ and $A$ which updates the solution step by step

md"""
Let $V=\mathbb R^n$ be equipped with the inner product $(\cdot,\cdot)$.
Let $A$ be an $n\times n$ nonsingular matrix.
Solve $Au=b$ iteratively. For this purpose, two components are
needed:
- __Preconditioner__: a matrix $M\approx A$ "approximating" the matrix
$A$ but with the property that the system $Mv=f$ is easy to solve
- __Iteration scheme__: algorithmic sequence using $M$ and $A$ which
updates the solution step by step

"""

517 ms

Simple iteration scheme

md"""
## Simple iteration scheme
"""

6.6 ms

Assume we know the exact solution $\hat u$: $A\hat u=b$.

Then it must fulfill the identity

$$ \begin{aligned} \hat u= \hat u - M^{-1}(A \hat u -b) \end{aligned}$$

$\Rightarrow$ iterative scheme: put the "old" value on the right hand side and the "new" value on the left hand side:

$$ \begin{aligned} u_{k+1}= u_k - M^{-1}(A u_k -b) \quad (k=0,1\dots) \end{aligned}$$

Obviously, if $u_k=\hat u$, the process would be stationary.

Otherwise it leads to a sequence of approximations

$$u_0, u_1,\dots, u_k, u_{k+1},\dots$$

md"""
Assume we know the exact solution $\hat u$: $A\hat u=b$.

Then it must fulfill the identity
```math
\begin{aligned}
\hat u= \hat u - M^{-1}(A \hat u -b)
\end{aligned}
```
$\Rightarrow$ iterative scheme: put the "old" value on the right hand
side and the "new" value on the left hand side:
```math
\begin{aligned}
u_{k+1}= u_k - M^{-1}(A u_k -b) \quad (k=0,1\dots)
\end{aligned}
```
Obviously, if $u_k=\hat u$, the process would be stationary.

Otherwise it leads to a sequence of approximations

```math
u_0, u_1,\dots, u_k, u_{k+1},\dots
```
"""

370 μs

Implementation: solve $Au=b$ with tolerance $\varepsilon$:

  1. Choose initial value $u_0$, set $k=0$

  2. Calculate residuum $r_k=Au_k -b$

  3. Test convergence: if $||r_k||<\varepsilon$ set $u=u_k$, finish

  4. Calculate update: solve $M v_k = r_k$

  5. Update solution: $u_{k+1}=u_k - v_k$, set $k=k+1$, repeat with step 2.

md"""
Implementation: solve $Au=b$ with tolerance $\varepsilon$:
1. Choose initial value $u_0$, set $k=0$
2. Calculate _residuum_ $r_k=Au_k -b$
3. Test convergence: if $||r_k||<\varepsilon$ set $u=u_k$, finish
4. Calculate _update_: solve $M v_k = r_k$
5. Update solution: $u_{k+1}=u_k - v_k$, set $k=k+1$, repeat with step 2.

"""

173 ms

General convergence theorem

md"""
### General convergence theorem
"""

5.6 ms

Let $\hat u$ be the solution of $Au=b$.

Let $e_k= u_k - \hat u$ be the error of the $k$-th iteration step. Then:

$$ \begin{aligned} u_{k+1}&= u_k - M^{-1}(A u_k -b) \\ & = (I-M^{-1}A) u_k + M^{-1}b\\ u_{k+1}-\hat u &= u_k -\hat u - M^{-1}(A u_k - A \hat u)\\ &=(I-M^{-1}A)( u_k - \hat u)\\ &=(I-M^{-1}A)^{k+1}( u_0 - \hat u) \end{aligned}$$

resulting in

$$ \begin{aligned} e_{k+1}= (I-M^{-1}A)^{k+1} e_0 \end{aligned}$$

  • So when does $(I-M^{-1}A)^k$ converge to zero for $k\to\infty$ ?

  • Denote $B=I-M^{-1}A$

md"""
Let $\hat u$ be the solution of $Au=b$.

Let $e_k= u_k - \hat u$ be the error of the $k$-th iteration step. Then:
```math
\begin{aligned}
u_{k+1}&= u_k - M^{-1}(A u_k -b) \\
& = (I-M^{-1}A) u_k + M^{-1}b\\
u_{k+1}-\hat u &= u_k -\hat u - M^{-1}(A u_k - A \hat u)\\
&=(I-M^{-1}A)( u_k - \hat u)\\
&=(I-M^{-1}A)^{k+1}( u_0 - \hat u)
\end{aligned}
```
resulting in
```math
\begin{aligned}
e_{k+1}= (I-M^{-1}A)^{k+1} e_0
\end{aligned}
```
- So when does $(I-M^{-1}A)^k$ converge to zero for $k\to\infty$ ?
- Denote $B=I-M^{-1}A$
"""

3.1 ms

Definition The spectral radius $\rho(B)$ is the largest absolute value of any eigenvalue of $B$: $\rho(B)= \max_{\lambda\in\sigma(B)} |\lambda|$.

definition"""
__Definition__
The spectral radius $\rho(B)$ is the largest absolute
value of any eigenvalue of $B$:
$\rho(B)= \max_{\lambda\in\sigma(B)} |\lambda|$.
"""

10.9 ms

Sufficient condition for iterative method convergence:

$$\rho(I-M^{-1}A)<1$$

important"""
__Sufficient condition for iterative method convergence:__
```math
\rho(I-M^{-1}A)<1
```
"""

73.7 μs
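md"""
A quick numerical check of this condition (a small sketch for illustration only): take the Jacobi preconditioner ``M=D(A)`` for a test matrix generated with `ExtendableSparse.fdrand` (introduced further below) and compute the spectral radius of the iteration matrix.
"""
let
	A=Matrix(fdrand(10,10))   # small 2D test matrix (100×100), dense for eigvals
	M=Diagonal(A)             # Jacobi preconditioner M=D(A)
	B=I-inv(M)*A              # iteration matrix
	maximum(abs.(eigvals(B))) # spectral radius ρ(I-M⁻¹A); a value < 1 indicates convergence
end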

Asymptotic convergence factor $\rho_{it}$ can be estimated via the spectral radius:

$$ \begin{aligned} \rho_{it}&= \lim_{k\to\infty}\left(\max_{u_0}\frac{|| (I-M^{-1}A)^k(u_0-\hat u)||}{||u_0-\hat u||}\right)^{\frac1k}\\ &= \lim_{k\to\infty}||(I-M^{-1}A)^k||^{\frac1k}\\ &=\rho(I-M^{-1}A) \end{aligned}$$

Depending on $u_0$, the rate may be faster, though

important"""
__Asymptotic convergence factor__ ``\rho_{it}`` can be estimated via the spectral
radius:
```math
\begin{aligned}
\rho_{it}&= \lim_{k\to\infty}\left(\max_{u_0}\frac{|| (I-M^{-1}A)^k(u_0-\hat u)||}{||u_0-\hat u||}\right)^{\frac1k}\\
&= \lim_{k\to\infty}||(I-M^{-1}A)^k||^{\frac1k}\\
&=\rho(I-M^{-1}A)
\end{aligned}
```
Depending on ``u_0``, the rate may be faster, though

"""
2.4 ms

Convergence estimate for symmetric positive definite A,M

md"""
### Convergence estimate for symmetric positive definite A,M
"""
152 μs

Matrix preconditioned Richardson iteration: $M$, $A$ spd.

Scaled Richardson iteration with preconditioner $M$

$$ u_{k+1}= u_k- \alpha M^{-1}(Au_k -b)$$

Spectral equivalence estimate

$$ 0<\gamma_{min}(Mu,u) \leq (Au,u) \leq \gamma_{max}(Mu,u)$$

$\Rightarrow$ $\gamma_{min}\leq \lambda_i \leq \gamma_{max}$ for all eigenvalues $\lambda_i$ of $M^{-1}A$

$\Rightarrow$ optimal parameter $\alpha=\frac{2}{\gamma_{max}+\gamma_{min}}$

$\Rightarrow$ convergence rate with optimal parameter: $\rho_{opt}\leq\frac{\kappa(M^{-1}A)-1}{\kappa(M^{-1}A)+1}$ where $\kappa(M^{-1}A)\leq \frac{\gamma_{max}}{\gamma_{min}}$

statement"""
__Matrix preconditioned Richardson iteration:__ ``M``, ``A`` spd.

Scaled Richardson iteration with preconditioner ``M``
```math
u_{k+1}= u_k- \alpha M^{-1}(Au_k -b)
```

Spectral equivalence estimate
```math
0<\gamma_{min}(Mu,u) \leq (Au,u) \leq \gamma_{max}(Mu,u)
```

``\Rightarrow`` ``\gamma_{min}\leq \lambda_i \leq \gamma_{max}`` for all eigenvalues ``\lambda_i`` of ``M^{-1}A``

``\Rightarrow`` optimal parameter ``\alpha=\frac{2}{\gamma_{max}+\gamma_{min}}``

``\Rightarrow`` convergence rate with optimal parameter: ``\rho_{opt}\leq\frac{\kappa(M^{-1}A)-1}{\kappa(M^{-1}A)+1}`` where ``\kappa(M^{-1}A)\leq \frac{\gamma_{max}}{\gamma_{min}}``


"""

169 μs
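md"""
A small illustration of these estimates (a sketch; symmetric `fdrand` test matrix, Jacobi preconditioner as ``M``): the extreme eigenvalues of ``M^{-1}A`` provide ``\gamma_{min},\gamma_{max}``, the optimal ``\alpha`` and the bound for ``\rho_{opt}``.
"""
let
	A=Matrix(fdrand(10,10))       # symmetric positive definite test matrix
	M=Diagonal(A)                 # spd preconditioner (Jacobi)
	λ=real(eigvals(inv(M)*A))     # eigenvalues of M⁻¹A (real and positive for spd A,M)
	γmin,γmax=extrema(λ)
	α=2/(γmax+γmin)               # optimal damping parameter
	κ=γmax/γmin                   # estimate of the condition number κ(M⁻¹A)
	(α=α, κ=κ, ρopt=(κ-1)/(κ+1))  # bound for the convergence rate
end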

Regular splittings

md"""
### Regular splittings
"""
143 μs

Definition

  • $A=M-N$ is a regular splitting if

    • $M$ is nonsingular

    • $M^{-1}\geq 0$, $N\geq 0$ are element-wise nonnegative

definition"""
__Definition__

- ``A=M-N`` is a _regular splitting_ if
  - ``M`` is nonsingular
  - ``M^{-1}\geq 0``, ``N\geq 0`` are element-wise nonnegative
"""

133 μs

Just remark that in this case $M^{-1}N=I-M^{-1}A$, and that we don't assume symmetry.

md"""
Just remark that in this case ``M^{-1}N=I-M^{-1}A``, and that we don't assume symmetry.
"""
94.1 ms

Theorem: Assume $A$ is nonsingular, $A^{-1}\geq 0$, and $A=M-N$ is a regular splitting. Then $\rho(M^{-1}N)<1$.

statement"""
__Theorem__: Assume ``A`` is nonsingular, ``A^{-1}\geq 0``, and
``A=M-N`` is a regular splitting. Then ``\rho(M^{-1}N)<1``.

"""

85.0 μs

With this theory we cannot say much about the value of the convergence rate, but we have a comparison theorem:

md"""
With this theory we cannot say much about the value of the convergence rate, but we have a comparison theorem:
"""
169 μs

Theorem: Let $A^{-1}\geq 0$, $A=M_1-N_1$ and $A=M_2-N_2$ be regular splittings.

If $N_2\geq N_1$, then $1 >\rho(M_2^{-1}N_2) \geq \rho(M_1^{-1}N_1)$.

statement"""
__Theorem:__ Let ``A^{-1}\geq 0``, ``A=M_1-N_1`` and ``A=M_2-N_2`` be
regular splittings.


If ``N_2\geq N_1``, then
``1 >\rho(M_2^{-1}N_2) \geq \rho(M_1^{-1}N_1)``.

"""
87.4 μs

What can we say about inverse nonnegative matrices ?

md"""
What can we say about inverse nonnegative matrices ?
"""
144 μs

Definition Let $A$ be an $n\times n$ real matrix. $A$ is called M-Matrix if

  • (i) $a_{ij}\leq 0$ for $i\neq j$

  • (ii) $A$ is nonsingular

  • (iii) $A^{-1}\geq 0$

definition"""
__Definition__ Let ``A`` be an ``n\times n `` real matrix.
``A`` is called M-Matrix if
- (i) ``a_{ij}\leq 0`` for ``i\neq j``
- (ii) ``A`` is nonsingular
- (iii) ``A^{-1}\geq 0``

"""
199 μs
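md"""
The defining properties (i)–(iii) can be checked numerically for a small `fdrand` test matrix (a sketch; the tolerance in (iii) guards against roundoff):
"""
let
	A=fdrand(4,4)                 # small test matrix with the M-property
	n=size(A,1)
	offdiag_nonpos=all(A[i,j]<=0 for i=1:n, j=1:n if i!=j)   # (i)
	nonsingular=det(Matrix(A))!=0                             # (ii)
	inverse_nonneg=all(x->x>=-1.0e-14,inv(Matrix(A)))         # (iii), up to roundoff
	(offdiag_nonpos,nonsingular,inverse_nonneg)
end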

Definition A square matrix $A$ is reducible if there exists a permutation matrix $P$ (re-ordering of equations) such that

$\begin{aligned} PAP^T =\begin{pmatrix} A_{11} & A_{12}\\ 0 & A_{22} \end{pmatrix} \end{aligned}$

$A$ is irreducible if it is not reducible.

definition"""
__Definition__ A square matrix $A$ is _reducible_ if there exists a
permutation matrix $P$ (re-ordering of equations) such that

$\begin{aligned}
PAP^T
=\begin{pmatrix}
A_{11} & A_{12}\\
0 & A_{22}
\end{pmatrix}
\end{aligned}$

$A$ is _irreducible_ if it is not reducible.


"""

105 μs

An M-Matrix $A$ is inverse positive, i.e. $A^{-1}>0$ if and only if it is irreducible

statement"""
An M-Matrix ``A`` is inverse positive, i.e. ``A^{-1}>0`` if and only if it is irreducible
"""
86.0 μs

Irreducibility is easy to check.

md"""
Irreducibility is easy to check.
"""
152 μs

Define a directed graph from the nonzero entries of an $n \times n$ matrix $A=(a_{ik})$:

  • Nodes: $\mathcal N= \{N_i\}_{i=1\dots n}$

  • Directed edges: $\mathcal E= \{ \overrightarrow{N_k N_l}| a_{kl}\neq 0\}$

  • Matrix entries $\equiv$ weights of directed edges

$\Rightarrow$ 1:1 equivalence between matrices and weighted directed graphs

md"""
Define a directed graph from the nonzero entries of an ``n \times n`` matrix $A=(a_{ik})$:
- Nodes: $\mathcal N= \{N_i\}_{i=1\dots n}$
- Directed edges: $\mathcal E= \{ \overrightarrow{N_k N_l}| a_{kl}\neq 0\}$
- Matrix entries $\equiv$ weights of directed edges
``\Rightarrow`` 1:1 equivalence between matrices and weighted directed graphs
"""

440 μs

Theorem : $A$ is irreducible $\Leftrightarrow$ the matrix graph is strongly connected, i.e. for each ordered pair $(N_i,N_j)$ there is a path consisting of directed edges, connecting them.

statement"""
__Theorem__ :
$A$ is irreducible $\Leftrightarrow$ the matrix graph is strongly connected,
i.e. for each _ordered_ pair $(N_i,N_j)$ there is a path consisting
of directed edges, connecting them.
"""

2.2 ms

Create a directed graph (digraph) from a matrix in Julia. Create edge labels from off-diagonal entries and node labels combined from diagonal entries and node indices.

md"""
Create a directed graph (digraph) from a matrix in Julia. Create edge labels from off-diagonal entries and node labels combined from diagonal entries and node indices.
"""
181 μs
function create_graph(matrix)
@assert size(matrix,1)==size(matrix,2)
n=size(matrix,1)
g=Graphs.SimpleDiGraph(n)
elabel=[]
nlabel=Any[]
for i in 1:n
push!(nlabel,"""$(i) \n $(round(matrix[i,i],sigdigits=3))""")
for j in 1:n
if i!=j && matrix[i,j]!=0
add_edge!(g,i,j)
push!(elabel,round(matrix[i,j],sigdigits=3))
end
end
end
g,nlabel,elabel
end;
2.1 ms

Use ExtendableSparse.fdrand to create test matrices like the heatmatrix in the previous lecture:

md"""
Use `ExtendableSparse.fdrand` to create test matrices like the heatmatrix in the previous lecture:
"""
215 μs
fdrand(, nx)
fdrand(, nx, ny)
fdrand(, nx, ny, nz; matrixtype, update, rand, symmetric)
fdrand(nx)

Create matrix for a mock finite difference operator for a diffusion problem with random coefficients on a unit hypercube $\Omega\subset\mathbb R^d$, with $d=1$ if nx>1 && ny==1 && nz==1, $d=2$ if nx>1 && ny>1 && nz==1 and $d=3$ if nx>1 && ny>1 && nz>1. In the symmetric case it corresponds to

$$ \begin{align*} -\nabla a \nabla u &= f&& \text{in}\; \Omega \\ a\nabla u\cdot \vec n + bu &=g && \text{on}\; \partial\Omega \end{align*}$$

The matrix is irreducibly diagonally dominant, has positive main diagonal entries and nonpositive off-diagonal entries, hence it has the M-Property. Therefore, its inverse will be a dense matrix with positive entries, and the spectral radius of the Jacobi iteration matrix $\rho(I-D(A)^{-1}A)<1$.

Moreover, in the symmetric case, it is positive definite.

Parameters + default values:

| Parameter + default value       | Description                               |
|---------------------------------|-------------------------------------------|
| nx                              | Number of unknowns in x direction         |
| ny                              | Number of unknowns in y direction         |
| nz                              | Number of unknowns in z direction         |
| matrixtype = SparseMatrixCSC    | Matrix type                               |
| update = (A,v,i,j)-> A[i,j]+=v  | Element update function                   |
| rand =()-> rand()               | Random number generator                   |
| symmetric=true                  | Whether to create symmetric matrix or not |

The sparsity structure is fixed to an orthogonal grid, resulting in a 3, 5 or 7 diagonal matrix depending on dimension. The entries are random unless e.g. rand=()->1 is passed as random number generator. Tested for Matrix, SparseMatrixCSC, ExtendableSparseMatrix, Tridiagonal, SparseMatrixLNK and :COO

@doc fdrand
912 ms
A2
25×25 SparseMatrixCSC{Float64, Int64} with 105 stored entries:
⠻⣦⡈⠢⡀⠀⠀⠀⠀⠀⠀⠀⠀
⠢⡈⠱⣦⡈⠢⡀⠀⠀⠀⠀⠀⠀
⠀⠈⠢⡈⠛⣤⡈⠢⡀⠀⠀⠀⠀
⠀⠀⠀⠈⠢⡈⠻⢆⡈⠢⡀⠀⠀
⠀⠀⠀⠀⠀⠈⠢⡈⠻⣦⠈⠢⡀
⠀⠀⠀⠀⠀⠀⠀⠈⠢⡀⠻⣦⡀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠀⠈⠁
A2=fdrand(5,5,symmetric=false)
36.0 μs
graph2,nlabel2,elabel2=create_graph(A2);
33.9 μs
(Plot: directed matrix graph of A2; node labels show node index and diagonal entry, edge labels show the off-diagonal entries.)
let
GraphPlot.gplot(graph2;
nodelabel=nlabel2,
edgelabel=elabel2,
nodefillc=RGBA(1.0,0.6,0.5,0.3),
EDGELABELSIZE=4,
NODESIZE=0.1,
edgestrokec=RGB(0.5,1.0,0.5),
EDGELINEWIDTH=0.5,
arrowlengthfrac=0.05
)
end
575 μs

Let $A=(a_{ij})$ be an $n\times n$ matrix.

  • $A$ is diagonally dominant if for $i=1\dots n$, $\displaystyle |a_{ii}|\geq \sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|$

  • $A$ is strictly diagonally dominant (sdd) if for $i=1\dots n$, $\displaystyle |a_{ii}|>\sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|$

  • $A$ is irreducibly diagonally dominant (idd) if

    1. $A$ is irreducible

    2. $A$ is diagonally dominant: for $i=1\dots n$, $\displaystyle |a_{ii}|\geq\sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|$

    3. for at least one $r$, $1 \leq r \leq n$, $\displaystyle |a_{rr}|>\sum_{\substack{j=1\dots n\\ j\neq r}} |a_{rj}|$

definition"""
Let ``A=(a_{ij})`` be an ``n\times n`` matrix.
- ``A`` is _diagonally dominant_ if for ``i=1\dots n``, ``\displaystyle |a_{ii}|\geq \sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|``
- ``A`` is _strictly diagonally dominant_ (sdd) if for ``i=1\dots n``, ``\displaystyle |a_{ii}|>\sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|``
- ``A`` is _irreducibly diagonally dominant_ (idd) if
  1. ``A`` is irreducible
  2. ``A`` is diagonally dominant: for ``i=1\dots n``, ``\displaystyle |a_{ii}|\geq\sum_{\substack{j=1\dots n\\ j\neq i}} |a_{ij}|``
  3. for at least one ``r``, ``1 \leq r \leq n``, ``\displaystyle |a_{rr}|>\sum_{\substack{j=1\dots n\\ j\neq r}} |a_{rj}|``
"""
221 μs
rowdiff (generic function with 1 method)
function rowdiff(A)
[abs(A[i,i])-sum(abs,A[i,1:i-1])-sum(abs,A[i,i+1:end]) for i=1:size(A,1) ]
end
826 μs
extrema(rowdiff(A2))
12.4 μs
using Tables
26.9 ms
(Interactive table view of the 25×25 matrix A2: positive main diagonal entries, nonpositive off-diagonal entries.)
Tables.table(A2)
98.1 ms

Given some matrix, we now have some nice recipes to establish nonsingularity and iterative method convergence:

  • Check if the matrix is irreducible.

    • This is mostly the case for elliptic and parabolic PDEs and can be done by checking the graph of the matrix

  • Check if the matrix is strictly or irreducibly diagonally dominant.

    • If yes, it is in addition nonsingular.

  • Check if main diagonal entries are positive and off-diagonal entries are nonpositive.

    • If yes, in addition, the matrix is an M-Matrix, its inverse is nonnegative, and elementary iterative methods based on regular splittings converge.

These criteria do not depend on the symmetry of the matrix! (A check along these lines follows below.)

important"""
Given some matrix, we now have some nice recipes to establish
nonsingularity and iterative method convergence:
- __Check if the matrix is irreducible.__
  - This is mostly the case for elliptic and parabolic PDEs and can be done by checking the graph of the matrix
- __Check if the matrix is strictly or irreducibly diagonally dominant.__
  - If yes, it is in addition nonsingular.
- __Check if main diagonal entries are positive and off-diagonal entries are nonpositive.__
  - If yes, in addition, the matrix is an M-Matrix, its inverse is nonnegative, and elementary iterative methods based on regular splittings converge.

These criteria do not depend on the symmetry of the matrix! (A check along these lines follows below.)

"""

19.5 ms
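md"""
The checks from this list applied to the test matrix `A2` (a sketch using `create_graph` and `rowdiff` from above together with `Graphs.is_strongly_connected`):
"""
let
	g,_,_=create_graph(A2)
	irreducible=Graphs.is_strongly_connected(g)   # irreducibility via the matrix graph
	rd=rowdiff(A2)
	idd=all(>=(0),rd) && any(>(0),rd)             # diagonally dominant, strictly in at least one row
	n=size(A2,1)
	mpattern=all(A2[i,i]>0 for i=1:n) && all(A2[i,j]<=0 for i=1:n, j=1:n if i!=j) # sign pattern
	(irreducible,idd,mpattern)
end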

Preconditioners

md"""
## Preconditioners
"""
146 μs

Jacobi preconditioner

md"""
### Jacobi preconditioner
"""
135 μs

Jacobi method: M=D, the diagonal of A

md"""
__Jacobi method__: M=D, the diagonal of A
"""
186 μs

Theorem: If A is an M-Matrix, then the Jacobi preconditioner leads to a regular splitting.

statement"""
__Theorem__: If A is an M-Matrix, then the Jacobi preconditioner leads to a regular splitting.
"""
80.5 μs

Incomplete LU factorization

md"""
### Incomplete LU factorization
"""
137 μs

Idea (Varga, Buleev, $\approx 1960$): derive a preconditioner not from an additive decomposition but from the LU factorization.

  • LU factorization has large fill-in. For a preconditioner, just limit the fill-in to a fixed pattern.

  • Apply the standard LU factorization method, but calculate only a part of the entries, e.g. only those which are larger than a certain threshold value, or only those which correspond to certain predefined pattern.

  • Result: incomplete LU factors $L$, $U$, remainder $R$: $A=LU-R$

  • What about zero pivots which prevent such an algorithm from being computable?

md"""
Idea (Varga, Buleev, $\approx 1960$): derive a preconditioner not from an additive decomposition but from the LU factorization.

- LU factorization has large fill-in. For a preconditioner, just limit the fill-in to a fixed pattern.
- Apply the standard LU factorization method, but calculate only a part of the entries, e.g. only those which are larger than a certain threshold value, or only those which correspond to certain predefined pattern.
- Result: incomplete LU factors $L$, $U$, remainder $R$: $A=LU-R$

- What about zero pivots which prevent such an algorithm from being computable?

"""
438 μs

Theorem (Saad, Th. 10.2): If $A$ is an M-Matrix, then the algorithm to compute the incomplete LU factorization with a given pattern is stable, i.e. does not deteriorate due to zero pivots (main diagonal elements). Moreover, $A=LU-R=M-N$ where $M=LU$ and $N=R$ is a regular splitting.

statement"""
__Theorem__ (Saad, Th. 10.2): If $A$ is an M-Matrix, then the
algorithm to compute the incomplete LU factorization with a given pattern
is stable, i.e. does not deteriorate due to zero pivots (main diagonal elements).
Moreover, $A=LU-R=M-N$ where $M=LU$ and $N=R$ is a regular splitting.

"""

115 μs
  • Generally better convergence properties than Jacobi, though we cannot apply the comparison theorem for regular splittings to compare between them

  • Block variants are possible

  • ILU Variants:

    • ILUM: ("modified"): add ignored off-diagonal entries to main diagonal

    • ILUT: ("threshold"): zero pattern calculated dynamically based on drop tolerance

    • ILU0: Drop all fill-in

    • Incomplete Cholesky: symmetric variant of ILU

  • Dependence on ordering

  • Can be parallelized using graph coloring

  • Not much theory: experiment for particular systems and see if it works well

  • I recommend it as the default initial guess for a sensible preconditioner

md"""
- Generally better convergence properties than Jacobi, though we cannot apply the comparison theorem for regular splittings to compare between them
- Block variants are possible
- ILU Variants:
  - ILUM: ("modified"): add ignored off-diagonal entries to main diagonal
  - ILUT: ("threshold"): zero pattern calculated dynamically based on drop tolerance
  - ILU0: Drop all fill-in
  - Incomplete Cholesky: symmetric variant of ILU
- Dependence on ordering
- Can be parallelized using graph coloring
- Not much theory: experiment for particular systems and see if it works well
- I recommend it as the default initial guess for a sensible preconditioner

"""
646 μs

Further preconditioners

  • Multigrid methods

  • Domain decomposition

  • Block variants of Jacobi, ILU...

md"""
### Further preconditioners
- Multigrid methods
- Domain decomposition
- Block variants of Jacobi, ILU...
"""
298 μs

Krylov subspace methods

md"""
## Krylov subspace methods
"""
135 μs
md"""
- So far we considered simple iterative schemes, perhaps with preconditioners
- Krylov subspace methods are more sophisticated and in many cases yield faster convergence than simple iterative schemes
- Reading material:
  - M. Gutknecht [A Brief Introduction to Krylov Space Methods for Solving Linear Systems](http://www.sam.math.ethz.ch/~mhg/pub/biksm.pdf)
  - J. Shewchuk [Introduction to the Conjugate Gradient Method Without the Agonizing Pain](http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf)
  - E. Carson, J. Liesen, Z. Strakoš: [70 years of Krylov subspace methods: The journey continues](https://arxiv.org/abs/2211.00953)

"""

39.4 ms

Definition: Let $A\in \mathbb{R}^{N\times N}$ be nonsingular, let $0\neq y \in \mathbb{R}^N$. The $k$-th Krylov subspace generated from $A$ by $y$ is defined as $\mathcal{K}_k(A,y)=\mathrm{span}\{y,Ay,\dots, A^{k-1}y\}.$

definition"""
__Definition:__
Let ``A\in \mathbb{R}^{N\times N}`` be nonsingular,
let ``0\neq y \in \mathbb{R}^N``. The ``k``-th _Krylov subspace_ generated from
``A`` by ``y`` is defined as
``\mathcal{K}_k(A,y)=\mathrm{span}\{y,Ay,\dots, A^{k-1}y\}.``

"""

88.5 μs
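md"""
For illustration, the Krylov matrix ``(y\; Ay\; \dots\; A^{k-1}y)`` can be set up explicitly (a small sketch; the methods discussed below never form it like this):
"""
let
	A=fdrand(10,10)
	y=rand(size(A,1))
	k=5
	v=copy(y)
	basis=[v]                  # columns y, Ay, ..., A^{k-1}y
	for j=2:k
		v=A*v
		push!(basis,v)
	end
	rank(hcat(basis...))       # dimension of the Krylov subspace K_k(A,y)
end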

Definition: Let $A\in \mathbb{R}^{N\times N}$ be nonsingular, let $0\neq y \in \mathbb{R}^N$. An iterative method such that

$$ \begin{aligned} u_k&= u_0+q_{k-1}(A)r_0 \in \mathcal{K}_k(A,r_0) \end{aligned}$$

where $q_{k-1}$ is a polynomial of degree $k-1$ is called Krylov subspace method.

definition"""
__Definition:__ Let ``A\in \mathbb{R}^{N\times N}`` be nonsingular,
let ``0\neq y \in \mathbb{R}^N``. An iterative method such that
```math
\begin{aligned}
u_k&= u_0+q_{k-1}(A)r_0 \in \mathcal{K}_k(A,r_0)
\end{aligned}
```
where ``q_{k-1}`` is a polynomial of degree ``k-1`` is called _Krylov subspace method_.

"""

90.7 μs

The idea of the GMRES method

md"""
### The idea of the GMRES method
"""
147 μs

Search the new iterate

$$ \begin{aligned} u_k&= u_0+q_{k-1}(A)r_0 \in \mathcal{K}_k(A,r_0) \end{aligned}$$

such that the residual norm $||r_k||= ||Au_k -b||$ is minimized. This results in the Generalized Minimum Residual (GMRES) method.

definition"""
Search the new iterate
```math
\begin{aligned}
u_k&= u_0+q_{k-1}(A)r_0 \in \mathcal{K}_k(A,r_0)
\end{aligned}
```
such that the residual norm ``||r_k||= ||Au_k -b||`` is minimized. This results in the _Generalized Minimum Residual_ (GMRES) method.
"""
2.7 ms
  • In order to find a good solution of this problem, we need to find an orthogonal basis of $\mathcal{K}_k$ $\Rightarrow$ run an orthogonalization algorithm at each step

  • One needs to store at least $k$ vectors simultaneously $\Rightarrow$ usually, the iteration is restarted after a fixed number of iteration steps to keep the dimension of $\mathcal{K}_k$ limited

  • There are preconditioned variants

  • For symmetric matrices, one gets short three-term recursions, and there is no need to store a full Krylov basis. This results in the MINRES method

  • Choosing $q_k$ such that we get short recursions always will sacrifice some of the convergence estimates for GMRES. Nevertheless, this approach is tried quite often, resulting in particular in the BiCGstab and CGS methods.

md"""
- In order to find a good solution of this problem, we need to find an orthogonal basis of ``\mathcal{K}_k`` ``\Rightarrow`` run an orthogonalization algorithm at each step
- One needs to store at least ``k`` vectors simultaneously ``\Rightarrow`` usually, the iteration is restarted after a fixed number of iteration steps to keep the dimension of ``\mathcal{K}_k`` limited
- There are preconditioned variants
- For symmetric matrices, one gets short three-term recursions, and there is no need to store a full Krylov basis. This results in the MINRES method
- Choosing ``q_k`` such that we get short recursions always will sacrifice some of the convergence estimates for GMRES. Nevertheless, this approach is tried quite often, resulting in particular in the BiCGstab and CGS methods.
"""

432 μs

Conjugate Gradients

md"""
### Conjugate Gradients
"""
138 μs

This method assumes that A and M are symmetric positive definite.

md"""
This method assumes that A and M are symmetric positive definite.
"""
154 μs

$$\begin{aligned} r_0&=b-Au_0\\ d_0&=M^{-1} r_0\\ \alpha_i&=\frac{(M^{-1}r_i,r_i)}{(Ad_i, d_i)}\\ u_{i+1}&=u_i+\alpha_i d_i\\ r_{i+1}&=r_i-\alpha_i Ad_i\\ \beta_{i+1}&=\frac{(M^{-1}r_{i+1},r_{i+1})}{(M^{-1}r_{i},r_{i})}\\ d_{i+1}&=M^{-1}r_{i+1}+\beta_{i+1}d_i \end{aligned}$$

md"""
```math
\begin{aligned}
r_0&=b-Au_0\\
d_0&=M^{-1} r_0\\
\alpha_i&=\frac{(M^{-1}r_i,r_i)}{(Ad_i, d_i)}\\
u_{i+1}&=u_i+\alpha_i d_i\\
r_{i+1}&=r_i-\alpha_i Ad_i\\
\beta_{i+1}&=\frac{(M^{-1}r_{i+1},r_{i+1})}{(M^{-1}r_{i},r_{i})}\\
d_{i+1}&=M^{-1}r_{i+1}+\beta_{i+1}d_i
\end{aligned}
```

"""
111 μs
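md"""
A direct transcription of this recursion (a minimal sketch for illustration; ``M`` is assumed to support the backslash operator, e.g. `I` or a `Diagonal`; for practical use see `IterativeSolvers.cg` below). For the test problem below, `pcg_sketch(A1,b1;M=Diagonal(A1))` should behave similarly to `cg` with the Jacobi preconditioner.
"""
function pcg_sketch(A,b;M=I,maxiter=100,tol=1.0e-10)
	u=zeros(length(b))
	r=b-A*u                  # r₀
	z=M\r
	d=copy(z)                # d₀=M⁻¹r₀
	rz=dot(z,r)              # (M⁻¹r₀,r₀)
	for i=1:maxiter
		Ad=A*d
		α=rz/dot(Ad,d)       # αᵢ
		u+=α*d
		r-=α*Ad
		norm(r)<tol && break
		z=M\r
		rznew=dot(z,r)
		β=rznew/rz           # βᵢ₊₁=(M⁻¹rᵢ₊₁,rᵢ₊₁)/(M⁻¹rᵢ,rᵢ)
		d=z+β*d
		rz=rznew
	end
	u
end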

The convergence rate (error reduction in a norm defined by M and A) can be estimated via $\rho_{CG}=2\frac{\sqrt\kappa-1}{\sqrt\kappa +1}$ where $\kappa=\kappa(M^{-1}A)$. In fact, the distribution of the eigenvalues is important for convergence as well.

CG is a Krylov subspace method as well.

md"""
The convergence rate (error reduction in a norm defined by M and A) can be estimated via ``\rho_{CG}=2\frac{\sqrt\kappa-1}{\sqrt\kappa +1}`` where ``\kappa=\kappa(M^{-1}A)``. In fact, the distribution of the eigenvalues is important for convergence as well.

CG is a Krylov subspace method as well.
"""
211 μs

Complexity estimates

md"""
## Complexity estimates
"""
167 μs

Solve linear system iteratively, for the error norm, assume $e_k\leq \rho^k e_0$. Iterate until $e_k\leq\epsilon$. Estimate the necessary number of iteration steps:

$$ \begin{aligned} \rho^ke_0&\leq \epsilon\\ k\ln \rho &< \ln\epsilon -\ln e_0\\ k \geq k_{\rho} &=\left\lceil\frac{\ln\epsilon-\ln e_0}{\ln \rho}\right\rceil \end{aligned}$$

$\Rightarrow$ we need at least $k_\rho$ iteration steps to reach accuracy $\epsilon$

md"""
Solve linear system iteratively, for the error norm, assume ``e_k\leq \rho^k e_0``.
Iterate until ``e_k\leq\epsilon``.
Estimate the necessary number of iteration steps:
```math
\begin{aligned}
\rho^ke_0&\leq \epsilon\\
k\ln \rho &< \ln\epsilon -\ln e_0\\
k \geq k_{\rho} &=\left\lceil\frac{\ln\epsilon-\ln e_0}{\ln \rho}\right\rceil
\end{aligned}
```
``\Rightarrow`` we need at least ``k_\rho`` iteration steps to reach accuracy ``\epsilon``
"""

226 μs
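md"""
For example, with hypothetical numbers ``\rho=0.99``, ``e_0=1``, ``\epsilon=10^{-10}``:
"""
let
	ρ=0.99; e0=1.0; ϵ=1.0e-10            # hypothetical rate, initial error, tolerance
	ceil(Int,(log(ϵ)-log(e0))/log(ρ))    # ≈ 2292 iteration steps
end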

The ideal iterative solver:

  • $\rho<\rho_0 <1$ independent of $h$ resp. $N$ $\Rightarrow$ $k_\rho$ independent of $N$.

  • $A$ sparse $\Rightarrow$ matrix-vector multiplication $A u$ has complexity $O(N)$

  • Solution of $Mv=r$ has complexity $O(N)$.

$\Rightarrow$ Number of iteration steps $k_\rho$ independent of $N$ Each iteration step has complexity $O(N)$ $\Rightarrow$ Overall complexity $O(N)$

definition"""
The __ideal iterative solver__:
- ``\rho<\rho_0 <1`` independent of ``h`` resp. ``N`` ``\Rightarrow`` ``k_\rho`` independent
of ``N``.
- ``A`` sparse ``\Rightarrow`` matrix-vector multiplication ``A u`` has complexity ``O(N)``
- Solution of ``Mv=r`` has complexity ``O(N)``.

``\Rightarrow`` Number of iteration steps ``k_\rho`` independent of ``N``
Each iteration step has complexity ``O(N)``
``\Rightarrow`` Overall complexity ``O(N)``

"""

2.6 ms

Typical situation with second order PDEs and e.g. Jacobi or ILU preconditioners:

$$ \begin{aligned} \kappa(M^{-1}A)&=O(h^{-2})\quad (h\to 0)\\ \rho(I-M^{-1}A) &\leq\frac{\kappa(M^{-1}A)-1}{\kappa(M^{-1}A)+1}\approx 1-O(h^{2})\quad (h\to 0)\\ \rho_{CG}(I-M^{-1}A) &\leq\frac{\sqrt{\kappa(M^{-1}A)}-1}{\sqrt{\kappa(M^{-1}A)}+1}\approx 1-O(h)\quad (h\to 0) \end{aligned}$$

  • Mean square error of approximation $||u-u_h||_2<h^\gamma$, in the simplest case $\gamma=2$.

important"""
Typical situation with second order PDEs and e.g. __Jacobi or ILU preconditioners:__
```math
\begin{aligned}
\kappa(M^{-1}A)&=O(h^{-2})\quad (h\to 0)\\
\rho(I-M^{-1}A) &\leq\frac{\kappa(M^{-1}A)-1}{\kappa(M^{-1}A)+1}\approx 1-O(h^{2})\quad (h\to 0)\\
\rho_{CG}(I-M^{-1}A) &\leq\frac{\sqrt{\kappa(M^{-1}A)}-1}{\sqrt{\kappa(M^{-1}A)}+1}\approx 1-O(h)\quad (h\to 0)
\end{aligned}
```
- Mean square error of approximation ``||u-u_h||_2<h^\gamma``,
in the simplest case ``\gamma=2``.

"""

138 μs

Back of the envelope complexity estimate

Simple iteration ($\delta=2$) or preconditioned CG ($\delta=1$):

  • $\rho=1-h^\delta$

    • $\Rightarrow$ $\ln \rho \approx -h^\delta$

    • $\Rightarrow$ $k_\rho = O(h^{-\delta})$

  • $d$: space dimension:

    • $N\approx n^d$

    • $h\approx \frac1n \approx N^{-\frac1d}$

    • $\Rightarrow$ $k_\rho=O(N^{\frac{\delta}d})$

  • $O(N)$ complexity of one iteration step (e.g. Jacobi, ILU0)

  • $\Rightarrow$ Overall complexity $O(N^{1+\frac{\delta}{d}})$=$O(N^{\frac{d+\delta}{d}})$

    • Typical scaling for simple iteration scheme: $\delta=2$ (Jacobi, ILU0 $\dots$)

    • Estimate for preconditioned CG (PCG) gives $\delta=1$

md"""
__Back of the envelope complexity estimate__

Simple iteration (``\delta=2``) or preconditioned CG (``\delta=1``):

- ``\rho=1-h^\delta``
  - ``\Rightarrow`` ``\ln \rho \approx -h^\delta``
  - ``\Rightarrow`` ``k_\rho = O(h^{-\delta})``
- ``d``: space dimension:
  - ``N\approx n^d``
  - ``h\approx \frac1n \approx N^{-\frac1d}``
  - ``\Rightarrow`` ``k_\rho=O(N^{\frac{\delta}d})``
- ``O(N)`` complexity of one iteration step (e.g. Jacobi, ILU0)
- ``\Rightarrow`` Overall complexity ``O(N^{1+\frac{\delta}{d}})``=``O(N^{\frac{d+\delta}{d}})``
  - Typical scaling for simple iteration scheme: ``\delta=2`` (Jacobi, ILU0 ``\dots``)
  - Estimate for preconditioned CG (PCG) gives ``\delta=1``

"""

835 μs

Overview on complexity estimates

$$ \begin{array}{|c|c|c|c|c|}\hline \text{Space dim} &Simple&PCG& \text{LU fact}& \text{LU solve}\\ \hline 1 & O(N^3) & O(N^2) & O(N) & O(N) \\ 2 & O(N^2) & O(N^{\frac32})& O(N^{\frac32}) & O(N\log N) \\ 3 & O(N^{\frac53}) & O(N^{\frac43})& O(N^2) & O(N^{\frac43})\\ \hline \text{Tendency with d}\uparrow & \downarrow & \downarrow & \uparrow\uparrow &\uparrow\\ \hline \end{array}$$

important"""
__Overview on complexity estimates__
```math
\begin{array}{|c|c|c|c|c|}\hline
\text{Space dim} &Simple&PCG& \text{LU fact}& \text{LU solve}\\
\hline
1 & O(N^3) & O(N^2) & O(N) & O(N) \\
2 & O(N^2) & O(N^{\frac32})& O(N^{\frac32}) & O(N\log N) \\
3 & O(N^{\frac53}) & O(N^{\frac43})& O(N^2) & O(N^{\frac43})\\
\hline
\text{Tendency with d}\uparrow & \downarrow & \downarrow & \uparrow\uparrow &\uparrow\\
\hline
\end{array}
```


"""

76.5 μs
pyplot(width=700,height=300) do
log10n=range(0,6,length=1000)
n=10 .^ log10n
N=n
h=1 ./ n
PyPlot.grid()
PyPlot.title("Complexity scaling for 1D problems")
PyPlot.ylim(1,1.0e30)
PyPlot.loglog(N,N.^3,label=L"Simple $O(N^3)$")
PyPlot.loglog(N,N.^2,label=L"PCG $O(N^2)$")
PyPlot.loglog(N,N.^1,label=L"Ideal $O(N)$")

PyPlot.loglog(N,N.^1,label=L"LU fact, $O(N)$",
linestyle="none",marker=".",markevery=50,color=:red,markersize=5)

PyPlot.loglog(N,N.^1,label=L"LU solve, $O(N)$",
linestyle="none", marker=".",markevery=(25,50),color=:black,markersize=5)
PyPlot.xlabel("Number of unknowns N")
th(x)=1 ./ x
tN(x)=1 ./ x
ax=gca()
ax2 = ax.secondary_xaxis(-0.25, functions=(th,tN))
ax2.set_xlabel("Grid spacing h")
PyPlot.ylabel("Operations")
PyPlot.legend(loc="upper left")
end

2.0 s
  • Sparse direct solvers, tridiagonal solvers are asymptotically optimal

  • Non-ideal iterative solvers significantly worse than optimal

important"""
- Sparse direct solvers, tridiagonal solvers are asymptotically optimal
- Non-ideal iterative solvers significantly worse than optimal
"""

79.5 μs
pyplot(width=700,height=300) do
log10n=range(0,6,length=1000)
n=10 .^ log10n
N=n.^2.0
h=1 ./ n
PyPlot.grid()
PyPlot.title("Complexity scaling for 2D problems")
PyPlot.ylim(1,1.0e30)
PyPlot.loglog(N,N.^2,label=L"Simple $O(N^2)$")
PyPlot.loglog(N,N.^(3/2),label=L"PCG $O(N^{\frac{3}{2}})$")
PyPlot.loglog(N,N.^1,label=L"Ideal $O(N)$")
PyPlot.loglog(N,N.^(3/2),label=L"LU fact, $O(N^{\frac{3}{2}})$",
linestyle="none",marker=".",markevery=50,color=:red,markersize=5)
PyPlot.loglog(N,N.*log10.(N),label=L"LU solve, $O(N \log N)$",
linestyle="none", marker=".",markevery=(25,50),color=:black,markersize=5)

PyPlot.xlabel("Number of unknowns N")
th(x)=1 ./ x.^(1.0/2)
tN(x)=1 ./ x.^2.0
ax=gca()
ax2 = ax.secondary_xaxis(-0.25, functions=(th,tN))
ax2.set_xlabel("Grid spacing h")
PyPlot.ylabel("Operations")
PyPlot.legend(loc="upper left")
end

392 ms
  • Sparse direct solvers better than simple nonideal iterative solvers

  • Sparse direct solvers on par with preconditioned CG


important"""
- Sparse direct solvers better than simple nonideal iterative solvers
- Sparse direct solvers on par with preconditioned CG
"""

75.9 μs
pyplot(width=600,height=300) do
log10n=range(0,6,length=1000)
n=10 .^ log10n
N=n.^3.0
h=1 ./ n
PyPlot.grid()
PyPlot.title("Complexity scaling for 3D problems")
PyPlot.ylim(1,1.0e30)
PyPlot.loglog(N,N.^(5/3),label=L"Simple $O(N^\frac{5}{3})$")
PyPlot.loglog(N,N.^(4/3),label=L"PCG $O(N^{\frac{4}{4}})$")
PyPlot.loglog(N,N.^1,label=L"Ideal $O(N)$")
PyPlot.loglog(N,N.^2,label=L"LU fact, $O(N^2)$",
linestyle="none",marker=".",markevery=50,color=:red,markersize=5)
PyPlot.loglog(N,N.^(4/3),label="LU solve",
linestyle="none", marker=".",markevery=(25,50),color=:black,markersize=5)
PyPlot.xlabel("Number of unknowns N")
th(x)=1 ./ x.^(1.0/3)
tN(x)=1 ./ x.^3.0
ax=gca()
ax2 = ax.secondary_xaxis(-0.25, functions=(th,tN))
ax2.set_xlabel("Grid spacing h")
PyPlot.ylabel("Operations")
PyPlot.legend(loc="upper left")
end

424 ms
  • Sparse LU factorization is expensive: going from $h$ to $h/2$ increases $N$ by a factor of 8 and operation count by a factor of 64!

  • Sparse LU solve on par with preconditioned CG

important"""
- Sparse LU factorization is expensive: going from $h$ to $h/2$ increases
$N$ by a factor of 8 and operation count by a factor of 64!
- Sparse LU solve on par with preconditioned CG
"""

91.8 μs

Examples

md"""
## Examples
"""
144 μs

Implementation of a Jacobi preconditioner: we need at least a constructor and ldiv! methods.

md"""
Implementation of a Jacobi preconditioner: we need at least a constructor and `ldiv!` methods.
"""
179 μs
begin
# Data structure: we store the inverse of the main diagonal
struct JacobiPreconditioner
invdiag::Vector
end
# Constructor:
function JacobiPreconditioner(A::AbstractMatrix)
n=size(A,1)
invdiag=zeros(n)
for i=1:n
invdiag[i]=1.0/A[i,i]
end
JacobiPreconditioner(invdiag)
end
# Solution of preconditioning system Mu=v
# Method name and signature are compatible to IterativeSolvers.jl
function LinearAlgebra.ldiv!(u,precon::JacobiPreconditioner,v)
invdiag=precon.invdiag
n=length(invdiag)
for i=1:n
u[i]=invdiag[i]*v[i]
end
u
end
# In-place solution of preconditioning system
LinearAlgebra.ldiv!(precon::JacobiPreconditioner,v)=ldiv!(v,precon,v)
end
2.3 ms

Implement an ILU0 preconditioner:

md"""
Implement an ILU0 preconditioner:
"""
142 μs
begin
# Data structure: we store the inverse of a modified main diagonal
# and a pointer to the main diagonal entry of each column
struct ILU0Preconditioner{Tv,Ti}
A::SparseMatrixCSC{Tv,Ti}
xdiag::Vector{Tv}
idiag::Vector{Ti}
end
function ILU0Preconditioner(A)
n=size(A,1)
colptr=A.colptr
rowval=A.rowval
nzval=A.nzval
idiag=zeros(Int64,n)
xdiag=zeros(n)
# calculate main diagonal indices
for j=1:n
for k=colptr[j]:colptr[j+1]-1
i=rowval[k]
if i==j
idiag[j]=k
break
end
end
end
# calculate modified inverse main diagonal
# (completion below is reconstructed -- the original cell is truncated in this export)
for j=1:n
# xdiag[j] so far holds the corrections pushed from previous columns
xdiag[j]=1/(nzval[idiag[j]]+xdiag[j])
for k=idiag[j]+1:colptr[j+1]-1
i=rowval[k]
for l=colptr[i]:colptr[i+1]-1
if rowval[l]==j
# push correction -a_ij*a_ji/d_j to the not yet inverted diagonal of row i
xdiag[i]-=nzval[k]*xdiag[j]*nzval[l]
break
end
end
end
end
ILU0Preconditioner(A,xdiag,idiag)
end
end
7.2 ms
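md"""
The cell above is cut off in this export. A reconstructed sketch of the missing preconditioner solve: for this variant only the diagonal is modified, so ``M=(\tilde D + L)\tilde D^{-1}(\tilde D + U)`` with ``L,U`` the strict triangular parts of ``A`` and ``\tilde D`` the modified diagonal stored via its inverse in `xdiag`. The method signatures follow those of the Jacobi preconditioner above.
"""
begin
# Solution of Mu=v by forward substitution, diagonal scaling and backward substitution.
# Reconstructed sketch -- the original cell is truncated in this export.
function LinearAlgebra.ldiv!(u,precon::ILU0Preconditioner,v)
	A=precon.A
	colptr=A.colptr
	rowval=A.rowval
	nzval=A.nzval
	xdiag=precon.xdiag
	idiag=precon.idiag
	n=size(A,1)
	u.=v
	# forward substitution with the lower triangular factor (column oriented)
	for j=1:n
		u[j]*=xdiag[j]
		for k=idiag[j]+1:colptr[j+1]-1
			u[rowval[k]]-=nzval[k]*u[j]
		end
	end
	# undo the diagonal scaling
	for j=1:n
		u[j]/=xdiag[j]
	end
	# backward substitution with the upper triangular factor
	for j=n:-1:1
		u[j]*=xdiag[j]
		for k=colptr[j]:idiag[j]-1
			u[rowval[k]]-=nzval[k]*u[j]
		end
	end
	u
end
# In-place variant, as for the Jacobi preconditioner
LinearAlgebra.ldiv!(precon::ILU0Preconditioner,v)=ldiv!(v,precon,v)
end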

Implement a simple iteration scheme

md"""
Implement a simple iteration scheme
"""
138 μs
simple (generic function with 1 method)
begin
function simple!(u,A,b;tol=1.0e-10,log=true,maxiter=100,Pl=nothing)
res=A*u-b # initial residual
r0=norm(res) # residual norm
history=[r0] # intialize history recording
for i=1:maxiter
u=u-ldiv!(Pl,res) # solve preconditioning system and update solution
res=A*u-b # calculate residual
r=norm(res) # residual norm
push!(history,r) # record in history
if (r/r0)<tol # check for relative tolerance
return u,Dict( :resnorm => history)
end
end
return u,Dict( :resnorm =>history )
end

simple(A,b;tol=1.0e-10, log=true,maxiter=100,Pl=nothing)=simple!(zeros(length(b)),A,b,tol=tol,maxiter=maxiter,log=log,Pl=Pl)
end

3.0 ms

Test problem

md"""
### Test problem
"""
136 μs
N
10000
N=10000
9.5 μs
true
dim=3; symmetric=true
9.7 μs
n
22
n=Int(ceil(N^(1/dim)))
548 ns
A1=fdrand([n for i=1:dim]...;symmetric);
67.7 ms
b1
b1=A1*ones(size(A1,1))
136 μs
A1Jacobi=JacobiPreconditioner(A1);
88.3 μs
A1ILU0=ILU0Preconditioner(A1);
218 μs
tol
1.0e-10
tol=1.0e-10
7.7 μs

Solve the test problem with the simple iterative solver:

md"""
Solve the test problem with the `simple` iterative solver:
"""
147 μs

Convergence simple+CG

md"""
### Convergence simple+CG
"""
133 μs
maxiter
10010
maxiter=10010
9.3 μs
sol_simple_jacobi,hist_simple_jacobi=simple(A1,b1;tol,maxiter,log=true,Pl=A1Jacobi)
7.8 s
sol_simple_ilu0,hist_simple_ilu0=simple(A1,b1;tol,maxiter,log=true,Pl=A1ILU0)
1.5 s

Solve the test problem with the CG iterative solver from IterativeSolvers.jl:

md"""
Solve the test problem with the CG iterative solver from IterativeSolvers.jl:
"""
150 μs
sol_cg_jacobi,hist_cg_jacobi=cg(A1,b1; reltol=tol,log=true,maxiter,Pl=A1Jacobi)
159 ms
sol_cg_ilu0,hist_cg_ilu0=cg(A1,b1; reltol=tol,log=true,maxiter,Pl=A1ILU0)
28.2 ms
  • As we see, all CG variants converge within the given number of iteration steps.

  • Preconditioning helps

  • The better the preconditioner, the faster the iteration (though this also depends on the initial value)

  • The behaviour of the CG residual is not monotone

md"""
- As we see, all CG variants converge within the given number of iteration steps.
- Preconditioning helps
- The better the preconditioner, the faster the iteration (though this also depends on the initial value)
- The behaviour of the CG residual is not monotone
"""

346 μs
pyplot(width=700) do
semilogy(hist_simple_jacobi[:resnorm],label="simple jacobi")
semilogy(hist_simple_ilu0[:resnorm],label="simple ilu0")
semilogy(hist_cg_jacobi[:resnorm],label="cg jacobi")
semilogy(hist_cg_ilu0[:resnorm],label="cg ilu0")
xlim(0,1000)
xlabel("iteration number k")
ylabel("residual norm")
legend(loc="lower left")
PyPlot.grid()
end

42.2 ms

Convergence: ILU + bicgstab

md"""
### Convergence: ILU + bicgstab
"""
174 μs
sol_bicgstab_ilu0,hist_bicgstab_ilu0=bicgstabl(A1,b1,reltol=tol,log=true,max_mv_products=2*maxiter,Pl=A1ILU0)
39.4 ms
sol_gmres_ilu0,hist_gmres_ilu0=gmres(A1,b1;Pl=A1ILU0,reltol=tol,log=true,maxiter)
56.8 ms
pyplot(width=700) do
semilogy(hist_bicgstab_ilu0[:resnorm],label="bicgstab ilu0")
semilogy(hist_gmres_ilu0[:resnorm],label="gmres ilu0")
xlabel("iteration number k")
ylabel("residual norm")
legend(loc="lower left")
PyPlot.grid()
end
38.9 ms

Solution times

md"""
### Solution times
"""
188 μs

Compare sparse direct solver, PCG and BiCGstab:

md"""
Compare sparse direct solver, PCG and BiCGstab:
"""
191 μs
BenchmarkTools.Trial: 140 samples with 1 evaluation.
 Range (min … max):  34.411 ms … 41.491 ms  ┊ GC (min … max): 0.00% … 3.50%
 Time  (median):     35.298 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   35.716 ms ±  1.157 ms  ┊ GC (mean ± σ):  0.88% ± 1.64%

       ▁▇█   ▂                                                 
  ▄▃▅██████▇▇█▅▄▄▁▃▁▄▁▄▆▃▄▆▄▃▄▃▃▁▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▁▁▃ ▃
  34.4 ms         Histogram: frequency by time        40.7 ms <

 Memory estimate: 24.05 MiB, allocs estimate: 66.
@benchmark A1\b1
11.6 s
BenchmarkTools.Trial: 182 samples with 1 evaluation.
 Range (min … max):  26.992 ms …  32.119 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     27.550 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   27.602 ms ± 412.222 μs  ┊ GC (mean ± σ):  0.04% ± 0.52%

                 ▂▅▄▆█▅▆▂▅ ▁                                    
  ▃▃▁▃▁▁▄▁▁▁▃▄▅▆▆█████████▇█▇▃▃▄▃▁▄▁▁▃▁▁▁▁▃▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▄ ▃
  27 ms           Histogram: frequency by time         28.7 ms <

 Memory estimate: 417.19 KiB, allocs estimate: 338.
if hist_cg_ilu0.isconverged
@benchmark cg(A1,b1; reltol=tol,log=true,maxiter,Pl=A1ILU0)
end
11.3 s
BenchmarkTools.Trial: 119 samples with 1 evaluation.
 Range (min … max):  37.073 ms … 46.741 ms  ┊ GC (min … max): 0.00% … 3.56%
 Time  (median):     42.495 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.278 ms ±  2.115 ms  ┊ GC (mean ± σ):  1.91% ± 1.95%

                          ▆ █     ▆▄▃▆   ▄ ▆▁       ▁          
  ▄▆▁▆▁▁▁▄▄▆▁▄▆▄▄▁▆▁▆▆▆▄▄▁█▇█▆▇▁▇▁████▇▄▄█▆██▆▄▁▁▇▁▇█▁▆▁▆▁▄▁▄ ▄
  37.1 ms         Histogram: frequency by time        46.6 ms <

 Memory estimate: 28.11 MiB, allocs estimate: 586.
if hist_bicgstab_ilu0.isconverged
@benchmark bicgstabl(A1,b1,reltol=tol,log=true,max_mv_products=2*maxiter,Pl=A1ILU0)
end
11.1 s

Final remarks

md"""
## Final remarks
"""
156 μs
  • Iterative solvers are a combination of preconditioning and iteration scheme. Krylov method based iteration schemes (CG, BiCGstab, GMRES... ) provide significant advantages.

  • Iterative solvers can beat direct solvers for problems stemming from the discretization of PDEs in 3D

  • Convergence of iterative solvers needs more matrix properties than just nonsingularity

  • Parallelization is easier for iterative solvers than for sparse direct solvers

statement"""
- Iterative solvers are a combination of preconditioning and iteration scheme. Krylov method based iteration schemes (CG, BiCGstab, GMRES... ) provide significant advantages.
- Iterative solvers can beat direct solvers for problems stemming from the discretization of PDEs in 3D
- Convergence of iterative solvers needs more matrix properties than just nonsingularity
- Parallelization is easier for iterative solvers than for sparse direct solvers

"""
111 μs

Julia packages

  • Iteration schemes

    • Krylov.jl (closer to current research)

    • IterativeSolvers.jl (used in this notebook)

  • Preconditioners

    • ILUZero.jl for zero fill-in ILU decomposition

    • IncompleteLU.jl - ILU with drop tolerance

    • AlgebraicMultigrid.jl - Multigrid methods with automatic coarsening

  • LinearSolve.jl - Attempt at a "one-stop shop" for linear system solution

  • ExtendableSparse.jl - Simple + efficient sparse matrix building + integration with preconditioners and various sparse direct solvers

statement"""
__Julia packages__
- Iteration schemes
  - Krylov.jl (closer to current research)
  - IterativeSolvers.jl (used in this notebook)
- Preconditioners
  - ILUZero.jl for zero fill-in ILU decomposition
  - IncompleteLU.jl - ILU with drop tolerance
  - AlgebraicMultigrid.jl - Multigrid methods with automatic coarsening
- LinearSolve.jl - Attempt at a "one-stop shop" for linear system solution
- ExtendableSparse.jl - Simple + efficient sparse matrix building + integration with preconditioners and various sparse direct solvers
"""
145 μs

hrule()
8.4 μs
pyplot (generic function with 1 method)
function pyplot(f;width=300,height=300)
clf()
f()
fig=gcf()
fig.set_size_inches(width/100,height/100)
fig
end

1.6 ms
begin
using PlutoUI,HypertextLiteral,LaTeXStrings,PyPlot,BenchmarkTools
PyPlot.svg(true)
end;
10.5 s
TableOfContents()
8.7 μs
begin
hrule()=html"""<hr>"""
highlight(mdstring,color)= htl"""<blockquote style="padding: 10px; background-color: $(color);">$(mdstring)</blockquote>"""
macro important_str(s) :(highlight(Markdown.parse($s),"#ffcccc")) end
macro definition_str(s) :(highlight(Markdown.parse($s),"#ccccff")) end
macro statement_str(s) :(highlight(Markdown.parse($s),"#ccffcc")) end
html"""
<style>
h1{background-color:#dddddd; padding: 10px;}
h2{background-color:#e7e7e7; padding: 10px;}
h3{background-color:#eeeeee; padding: 10px;}
h4{background-color:#f7f7f7; padding: 10px;}
pluto-log-dot-sizer { max-width: 655px;}
pluto-log-dot.Stdout { background: #002000;
color: #10f080;
border: 6px solid #b7b7b7;
min-width: 18em;
max-height: 300px;
width: 675px;
overflow: auto;
}
</style>
"""
end
423 ms