xxxxxxxxxxbegin using Pkg Pkg.activate(mktempdir()) Pkg.add("PyPlot") Pkg.add("BenchmarkTools") Pkg.add("PlutoUI") using PlutoUI,PyPlot,BenchmarkToolsendxxxxxxxxxxusing LinearAlgebraLinear System Solution
Let
be an matrix, .Solve
Direct methods:
Exact up to machine precision
Sometimes expensive, sometimes not
Iterative methods:
"Only" approximate
With good convergence and proper accuracy control, results may be not worse than for direct methods
Sometimes expensive, sometimes not
Convergence guarantee is problem dependent and can be tricky
Matrix & vector norms
Vector norms
let
: sum norm, -norm
1.5xxxxxxxxxxnorm(v,1) : Euclidean norm, -norm
1.06301
1.06301
xxxxxxxxxxnorm(v,2),norm(v) : maximum norm, -norm
1.0xxxxxxxxxxnorm(v,Inf)Matrix norms
Matrix
Representation of linear operator
defined by withVector norm
induces corresponding matrix norm:
3×3 Array{Float64,2}:
3.0 2.0 3.0
0.1 0.3 0.5
0.6 2.0 3.0xxxxxxxxxxM=[3.0 2.0 3.0; 0.1 0.3 0.5; 0.6 2.0 3.0] maximum of column sums of absolute values of entries
6.5xxxxxxxxxxopnorm(M,1) with : largest eigenvalue of .
xxxxxxxxxxmd"""- ``||A||_2=\sqrt{\lambda_{max}}`` with ``\lambda_{max}``: largest eigenvalue of ``A^TA``."""5.78018
5.78018
5.78018
xxxxxxxxxxopnorm(M,2), opnorm(M),sqrt(maximum(eigvals(M'*M))) maximum of row sums of absolute values of entries
xxxxxxxxxxmd"""- ``||A||_\infty= \max_{i=1}^n \sum_{j=1}^n |a_{ij}| = ||A^T||_1`` maximum of row sums of absolute values of entries"""8.0
8.0
xxxxxxxxxxopnorm(M,Inf),opnorm(M',1)Condition number and error propagation
Solve
, where is inexactLet
be the error in and be the resulting error in such that
Since
, we getTherefore
where
This means that the relative error in the solution is proportional to the relative error of the right hand side. The proportionality factor
xxxxxxxxxxmd"""## Condition number and error propagation- Solve ``Ax=b``, where ``b`` is inexact- Let ``\Delta b`` be the error in ``b`` and ``\Delta x`` be the resulting error in ``x`` such that ``A(x+\Delta x)=b+\Delta b.`` - Since ``Ax=b``, we get ``A\Delta x=\Delta b``- Therefore ``\Delta x=A^{-1} \Delta b`` ``||A||\cdot ||x||\geq||b||`` ``||\Delta x||\leq ||A^{-1}||\cdot ||\Delta b||`` ``\Rightarrow \frac{||\Delta x||}{||x||} \leq \kappa(A) \frac{||\Delta b||}{||b||}`` where ``\kappa(A)= ||A||\cdot ||A^{-1}||`` is the *condition number* of ``A``.This means that the relative error in the solution is proportional to the relative error of the right hand side. The proportionality factor ``\kappa(A)`` is usually larger (and in most relevant cases significantly larger) than one. Just remark thatthis estimates does not assume inexact arithmetics."""Let us have an example. We use rational arithmetics in order to perform exact calulations.
xxxxxxxxxxmd"""Let us have an example. We use rational arithmetics in order to performexact calulations."""Float64xxxxxxxxxxT_test=Float641.0e6xxxxxxxxxxa=T_test(1_000_000)1.0e-9xxxxxxxxxxpert_b=T_test(1//1_000_000_000)2×2 Array{Float64,2}:
1.0 -1.0
1.0e6 1.0e6xxxxxxxxxxA=[ 1 -1; a a]1.0e6xxxxxxxxxxκ=opnorm(A)*opnorm(inv(A))2×2 Array{Float64,2}:
0.5 5.0e-7
-0.5 5.0e-7xxxxxxxxxxinv(A)Assume a solution vector:
xxxxxxxxxxmd"""Assume a solution vector:"""1
1
xxxxxxxxxxx=[1,1]Create corresponding right hand side:
xxxxxxxxxxmd"""Create corresponding right hand side:"""0.0
2.0e6
xxxxxxxxxxb=A*xDefine a perturbation of the right hand side:
xxxxxxxxxxmd"""Define a perturbation of the right hand side:"""1.0e-9
1.0e-9
xxxxxxxxxxΔb=[pert_b, pert_b]Calculate the error with respect to the unperturbed solution:
xxxxxxxxxxmd"""Calculate the error with respect to the unperturbed solution:"""5.0e-10
-5.0e-10
xxxxxxxxxxΔx=inv(A)*(b+Δb)-xRelative error of right hand side:
xxxxxxxxxxmd"""Relative error of right hand side:"""7.071067811865475e-16xxxxxxxxxxδb=norm(Δb)/norm(b)Relative error of solution:
xxxxxxxxxxmd"""Relative error of solution:"""5.000000413703827e-10xxxxxxxxxxδx=norm(Δx)/norm(x)Comparison with condition number based estimate:
xxxxxxxxxxmd"""Comparison with condition number based estimate:"""1.0e6
7.07107e5
xxxxxxxxxxκ,δx/δbComplexity: "big O notation"
Let
Write
if there exists a constant
Often, one skips the part "
Examples:
Addition of two vectors:
Matrix-vector multiplication (for matrix where all entries are assumed to be nonzero):
xxxxxxxxxxmd"""## Complexity: "big O notation" Let ``f,g: \mathbb V \to \mathbb R^+`` be some functions, where ``\mathbb V=\mathbb N`` or ``\mathbb V=\mathbb R``. Write ```math f(x)=O(g(x)) \quad (x\to\infty) ```if there exists a constant ``C>0`` and ``x_0\in \mathbb V`` such that `` \forall x>x_0, \quad |f(x)|\leq C|g(x)|`` Often, one skips the part "``(x \to \infty)``" Examples: - Addition of two vectors: ``O(N)`` - Matrix-vector multiplication (for matrix where all entries are assumed to be nonzero): ``O(N^2)``"""A Direct method: Cramer's rule
Solve
This takes
xxxxxxxxxxmd"""## A Direct method: Cramer's ruleSolve $Ax=b$ by Cramer's rule:```mathx_i=\left| \begin{matrix} a_{11}&a_{12}&\ldots&a_{1i-1}&b_1&a_{1i+1}&\ldots&a_{1N}\\ a_{21}& &\ldots& &b_2& &\ldots&a_{2N}\\ \vdots& & & &\vdots& & &\vdots\\ a_{N1}& &\ldots& &b_N& &\ldots&a_{NN} \end{matrix}\right| / |A| \quad (i=1\dots N)```This takes $O(N!)$ operations..."""LU decomposition
Gaussian elimination
So let us have a look at Gaussian elimination to solve
xxxxxxxxxxmd"""## LU decomposition#### Gaussian elimination So let us have a look at Gaussian elimination to solve ``Ax=b``.The elementary matrix manipulation step in Gaussian elimination ist the multiplication of row k by ``-a_{jk}/a_{kk}`` and its addition to row j such that element``_{jk}`` in the resulting matrix becomes zero. If this is done at once for all ``j>k``, we can express this operation as the left multiplication of ``A`` by a lower triangular Gauss transformation matrix ``M(A,k)``. """xxxxxxxxxxfunction gausstransform(A,k) n=size(A,1) M=Matrix(one(eltype(A))I,n,n) for j=k+1:n M[j,k]=-A[j,k]/A[k,k] end Mend;Define the number type for the following examples:
xxxxxxxxxxmd"""Define the number type for the following examples:"""RationalxxxxxxxxxxT_lu=RationalDefine a test matrix:
xxxxxxxxxxmd"""Define a test matrix:"""4×4 Array{Rational,2}:
2//1 1//1 3//1 4//1
5//1 6//1 7//1 8//1
7//1 6//1 8//1 5//1
3//1 4//1 2//1 2//1xxxxxxxxxxA1=T_lu[2 1 3 4; 5 6 7 8; 7 6 8 5; 3 4 2 2;] This is the Gauss transform for first column:
xxxxxxxxxxmd"""This is the Gauss transform for first column:"""4×4 Array{Rational{Int64},2}:
1//1 0//1 0//1 0//1
-5//2 1//1 0//1 0//1
-7//2 0//1 1//1 0//1
-3//2 0//1 0//1 1//1xxxxxxxxxxgausstransform(A1,1)Applying it then sets all elements in the first column to zero besides of the main diagonal element:
xxxxxxxxxxmd"""Applying it then sets all elements in the first column to zero besides of the main diagonal element:"""4×4 Array{Rational,2}:
2//1 1//1 3//1 4//1
0//1 7//2 -1//2 -2//1
0//1 5//2 -5//2 -9//1
0//1 5//2 -5//2 -4//1xxxxxxxxxxU1=gausstransform(A1,1)*A1We can repeat this with the second column:
xxxxxxxxxxmd"""We can repeat this with the second column:"""4×4 Array{Rational,2}:
2//1 1//1 3//1 4//1
0//1 7//2 -1//2 -2//1
0//1 0//1 -15//7 -53//7
0//1 0//1 -15//7 -18//7xxxxxxxxxxU2=gausstransform(U1,2)*U1And the third column:
xxxxxxxxxxmd"""And the third column:"""4×4 Array{Rational,2}:
2//1 1//1 3//1 4//1
0//1 7//2 -1//2 -2//1
0//1 0//1 -15//7 -53//7
0//1 0//1 0//1 5//1xxxxxxxxxxU3=gausstransform(U2,3)*U2And here, we arrived at a triangular matrix. In the standard Gaussian elimination we would have manipulated the right hand side accordingly.
From here on we would start the backsubstitution which in fact is the solution of a triangular system of equations.
However, let's have a look at what we have done here: we arrived at
Thus,
xxxxxxxxxxmd"""And here, we arrived at a triangular matrix. In the standard Gaussian elimination we would have manipulated the right hand side accordingly. From here on we would start the backsubstitution which in fact is the solution of a triangular system of equations.However, let's have a look at what we have done here: we arrived at```math\begin{align}U_1&=M(A,1)A\\U_2&=M(U_1,2)U_1=M(U_1,2)M(A,1)A\\U_3&=M(U_2,3)U_2=M(U_2,3)M(U_1,2)M(A,1)A\end{align}```Thus, ``A=LU`` with ``L=M(A,1)^{-1}M(U_1,2)^{-1}M(U_2,3)^{-1}`` and ``U=U_3````L`` is a lower triangular matrix and ``U`` is a upper triangular matrix."""A first LU decomposition
We can put this together into a function:
xxxxxxxxxxmd"""#### A first LU decompositionWe can put this together into a function:"""xxxxxxxxxxfunction my_first_lu_decomposition(A) n=size(A,1) L=Matrix(one(eltype(A))I,n,n) # L=I U=A for k=1:n-1 M=gausstransform(U,k) L=L*inv(M) U=M*U end L,Uend;4×4 Array{Rational{Int64},2}:
1//1 0//1 0//1 0//1
5//2 1//1 0//1 0//1
7//2 5//7 1//1 0//1
3//2 5//7 1//1 1//14×4 Array{Rational,2}:
2//1 1//1 3//1 4//1
0//1 7//2 -1//2 -2//1
0//1 0//1 -15//7 -53//7
0//1 0//1 0//1 5//1xxxxxxxxxxLx,Ux=my_first_lu_decomposition(A1)Check for correctness:
xxxxxxxxxxmd"""Check for correctness:"""4×4 Array{Rational{Int64},2}:
0//1 0//1 0//1 0//1
0//1 0//1 0//1 0//1
0//1 0//1 0//1 0//1
0//1 0//1 0//1 0//1xxxxxxxxxxLx*Ux-A1So now we can write
xxxxxxxxxxmd"""So now we can write ``A=LU``. Solving ``LUx=b`` then amounts to solve two triangular systems:```math\begin{align} Ly&=b\\ Ux&=y\end{align}```"""my_first_lu_solve (generic function with 1 method)xxxxxxxxxxfunction my_first_lu_solve(L,U,b) y=inv(L)*b x=inv(U)*yend1
2
3
4
xxxxxxxxxxb1=[1,2,3,4]182//75
-7//75
-154//75
3//5
xxxxxxxxxxx1=my_first_lu_solve(Lx,Ux,b1)Check...
xxxxxxxxxxmd"""Check..."""0//1
0//1
0//1
0//1
xxxxxxxxxxA1*x1-b1... in order to be didactical, in this example, we made a very inefficient implementation by creating matrices in each step. We even cheated by using inv in order to solve a triangular system.
xxxxxxxxxxmd"""... in order to be didactical, in this example, we made a very inefficient implementation by creating matrices in each step. We even cheated by using `inv` in order to solve a triangular system."""A reasonable implementation
Doolittles method (Adapted from wikipedia: LU_decomposition)
This allows to perfrom LU decomposition in-place.
xxxxxxxxxxmd"""#### A reasonable implementation- Doolittles method (Adapted from [__wikipedia: LU_decomposition__](https://en.wikipedia.org/wiki/LU_decomposition#MATLAB_code_examples))- This allows to perfrom LU decomposition in-place."""better_lu_decomposition! (generic function with 1 method)xxxxxxxxxxfunction better_lu_decomposition!(LU) n = size(LU,1) # decomposition of matrix, Doolittle’s Method for i = 1:n for j = 1:(i - 1) alpha = LU[i,j]; for k = 1:(j - 1) alpha = alpha - LU[i,k]*LU[k,j]; end LU[i,j] = alpha/LU[j,j]; end for j = i:n alpha = LU[i,j]; for k = 1:(i - 1) alpha = alpha - LU[i,k]*LU[k,j]; end LU[i,j] = alpha; end endendbetter_lu_solve (generic function with 1 method)xxxxxxxxxxfunction better_lu_solve(LU,b) n = length(b); x = zeros(eltype(LU),n); y = zeros(eltype(LU),n); # LU= L+U-I # find solution of Ly = b for i = 1:n alpha = zero(eltype(LU)); for k = 1:i alpha = alpha + LU[i,k]*y[k]; end y[i] = b[i] - alpha; end # find solution of Ux = y for i = n:-1:1 alpha = zero(eltype(LU)); for k = (i + 1):n alpha = alpha + LU[i,k]*x[k]; end x[i] = (y[i] - alpha)/LU[i, i]; end xendWe can then implement a method for linear system solution:
xxxxxxxxxxmd"""We can then implement a method for linear system solution:"""better_solve (generic function with 1 method)xxxxxxxxxxfunction better_solve(A,b) LU=copy(A) better_lu_decomposition!(LU) better_lu_solve(LU,b)end182//75
-7//75
-154//75
3//5
xxxxxxxxxxx2=better_solve(A1,b1)0//1
0//1
0//1
0//1
xxxxxxxxxxA1*x2-b1Pivoting
So far, we ignored the possibility that a diagonal element becomes zero during the LU factorization procedure.
Pivoting tries to remedy the problem that during the algorithm, diagonal elements can become zero. Before undertaking the next Gauss transformation step, we can exchange rows such that we always dividy by the largest of the remaining diagonal elements. This would then in fact result in a decompositon
where
Almost all practically used LU decomposition implementations use partial pivoting.
xxxxxxxxxxmd"""### PivotingSo far, we ignored the possibility that a diagonal element becomes zero during the LU factorization procedure.Pivoting tries to remedy the problem that during the algorithm, diagonal elements can become zero. Before undertaking the next Gauss transformation step, we can exchange rows such that we always dividy by the largest of the remaining diagonal elements.This would then in fact result in a decompositon```mathPA=LU```where ``P`` is a permutation matrix which can be stored in an integer vector. This approach is called "partial pivoting". Full pivoting in addition would perform columnpermutations. This would result in another permutation matrix ``Q`` and the decomposition```mathPAQ=LU```Almost all practically used LU decomposition implementations use partial pivoting."""LU Factorization from Julia library
Julia implements a pivoting LU factorization
xxxxxxxxxxmd"""### LU Factorization from Julia libraryJulia implements a pivoting LU factorization"""LU{Rational,Array{Rational,2}}
L factor:
4×4 Array{Rational,2}:
1//1 0//1 0//1 0//1
5//7 1//1 0//1 0//1
3//7 5//6 1//1 0//1
2//7 -5//12 -1//2 1//1
U factor:
4×4 Array{Rational,2}:
7//1 6//1 8//1 5//1
0//1 12//7 9//7 31//7
0//1 0//1 -5//2 -23//6
0//1 0//1 0//1 5//2xxxxxxxxxxlu1=lu(A1)Like in matlab, the backslash opertor "solves", in this case it solves the LU factorization:
xxxxxxxxxxmd"""Like in matlab, the backslash opertor "solves", in this case it solves the LU factorization:"""182//75
-7//75
-154//75
3//5
xxxxxxxxxxlu1\b1Of course we can apply \ directly to a matrix. However, behind this always LU decomposition and LU solve are called:
xxxxxxxxxxmd"""Of course we can apply `\` directly to a matrix. However, behind this always LU decomposition and LU solve are called:"""182//75
-7//75
-154//75
3//5
xxxxxxxxxxx3=A1\b10//1
0//1
0//1
0//1
xxxxxxxxxxA1*x3-b1LU vs. inv
xxxxxxxxxxmd"""### LU vs. inv"""In principle we could work with the inverse matrix as well:
xxxxxxxxxxmd"""In principle we could work with the inverse matrix as well:"""4×4 Array{Rational{Int64},2}:
68//75 -2//3 2//25 49//75
-43//75 1//3 -2//25 1//75
-46//75 1//3 6//25 -53//75
2//5 0//1 -1//5 1//5xxxxxxxxxxA1inv=inv(A1)182//75
-7//75
-154//75
3//5
xxxxxxxxxxA1inv*b1However, inversion is more complex than the LU factorization.
xxxxxxxxxxmd"""However, inversion is more complex than the LU factorization."""Some performance tests.
We generate matrices
xxxxxxxxxxmd"""### Some performance tests. We generate matrices """rand_Ab (generic function with 1 method)xxxxxxxxxxrand_Ab(n)=(100.0I(n)+rand(n,n),rand(n))xxxxxxxxxxfunction perftest_lu(n) A,b=rand_Ab(n) A\bend;perftest_inv (generic function with 1 method)xxxxxxxxxxfunction perftest_inv(n) A,b=rand_Ab(n) inv(A)*bendperftest_better (generic function with 1 method)xxxxxxxxxxfunction perftest_better(n) A,b=rand_Ab(n) better_solve(A,b)endtest_and_plot (generic function with 1 method)xxxxxxxxxxfunction test_and_plot(pmax) N= 2 .^collect(5:pmax) t_inv=perftest_inv.(N) t_lu=perftest_lu.(N) clf() loglog(N,t_inv,"-o",label="inv(A)*b") loglog(N,t_lu,"-o",label="A\\b") loglog(N,1.0e-9*N.^2.75,"k--",label="O(\$N^{2.75}\$)") if pmax<12 t_b=perftest_better.(N) loglog(N,t_b,"-o",label="\"better\"") loglog(N,1.0e-9*N.^3,"k-",label="O(\$N^{3}\$)") end xlabel("Number of unknowns N") ylabel("Execution time t/s") title("Experimental complexity of dense linear system solution") grid() legend(loc="upper left") gcf()endxxxxxxxxxxtest_and_plot(11)The overall complexity in this experiment is around
which is in the region of some theoretical estimates.A good implementation is hard to get right, straightforward code performs worse than the system implementation
Using inversion instead of
\is significantly slower (log scale in the plot!)For standard floating point types, Julia uses highly optimized versions of LINPACK and BLAS
Same for python/numpy and many other coding environments