
Generalized minimal residual method (GMRES) implementation

Index

  1. Introduction
  2. Generalized minimal residual method
  3. Implementation
  4. Installation Guide
  5. Execution Guide
  6. References

Introduction

In mathematics, the generalized minimal residual method (GMRES) is an iterative method for the numerical solution of an indefinite non-symmetric system of linear equations.
The method approximates the solution by the vector in a Krylov subspace with minimal residual. The Arnoldi iteration is used to find this vector.

Krylov Subspace

In linear algebra, the order-r Krylov subspace generated by an n-by-n matrix A and a vector b of dimension n is the linear subspace spanned by the images of b under the first r powers of A (starting from $A^{0}=I$ ), that is,

$$\mathcal{K}_{r}(A,b) = \text{span} \lbrace b,Ab,A^{2} b, \ldots , A^{r-1}b \rbrace .$$

Modern iterative methods such as Arnoldi iteration can be used for finding one (or a few) eigenvalues of large sparse matrices or solving large systems of linear equations. They try to avoid matrix-matrix operations, but rather multiply vectors by the matrix and work with the resulting vectors. Starting with a vector ${\displaystyle b}$, one computes ${\displaystyle Ab}$, then one multiplies that vector by ${\displaystyle A}$ to find ${\displaystyle A^{2}b}$ and so on. All algorithms that work this way are referred to as Krylov subspace methods; they are among the most successful methods currently available in numerical linear algebra.
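For illustration, the raw Krylov vectors can be generated in Octave with nothing but matrix-vector products (a minimal sketch; the example matrix and vector are our own choices):

    % Build the (unnormalized) Krylov basis {b, Ab, A^2*b, ..., A^(r-1)*b}
    % using only matrix-vector products.
    A = [4 1 0; 1 3 1; 0 1 2];   % small example matrix
    b = [1; 0; 0];               % starting vector
    r = 3;                       % order of the Krylov subspace
    K = zeros(length(b), r);
    K(:,1) = b;
    for j = 2:r
      K(:,j) = A * K(:,j-1);     % next power of A applied to b
    end
    disp(K)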

Arnoldi Iteration

In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of an iterative method. Arnoldi finds an approximation to the eigenvalues and eigenvectors of general (possibly non-Hermitian) matrices by constructing an orthonormal basis of the Krylov subspace, which makes it particularly useful when dealing with large sparse matrices.

The Arnoldi method belongs to a class of linear algebra algorithms that give a partial result after a small number of iterations, in contrast to so-called direct methods, which must run to completion to give any useful result (see, for example, the Householder transformation). The partial result in this case is the first few vectors of the basis that the algorithm is building.
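As a concrete reference, here is a compact Octave sketch of the Arnoldi iteration (our own illustrative code, not the project notebook's implementation):

    % Arnoldi iteration: build an orthonormal basis Q of the Krylov subspace
    % and the (n+1)-by-n upper Hessenberg matrix H with A*Q(:,1:n) = Q*H.
    % Save as arnoldi.m.
    function [Q, H] = arnoldi(A, b, n)
      m = length(b);
      Q = zeros(m, n+1);
      H = zeros(n+1, n);
      Q(:,1) = b / norm(b);           % q_1 = b / ||b||
      for k = 1:n
        v = A * Q(:,k);               % expand the subspace
        for j = 1:k                   % modified Gram-Schmidt orthogonalization
          H(j,k) = Q(:,j)' * v;
          v = v - H(j,k) * Q(:,j);
        end
        H(k+1,k) = norm(v);
        if H(k+1,k) < eps             % breakdown: the subspace is invariant
          Q = Q(:,1:k); H = H(1:k,1:k);
          return
        end
        Q(:,k+1) = v / H(k+1,k);
      end
    end

The eigenvalues of the leading n-by-n block H(1:n,1:n) (the Ritz values) are the standard Arnoldi approximations to the eigenvalues of A.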


Generalized minimal residual method

The GMRES method was developed by Yousef Saad and Martin H. Schultz in 1986. It is a generalization and improvement of the MINRES method due to Paige and Saunders in 1975. The MINRES method requires that the matrix is symmetric, but has the advantage that it only requires handling of three vectors. GMRES is a special case of the DIIS method developed by Peter Pulay in 1980 (DIIS is also applicable to non-linear systems).

Method Description

Denote the Euclidean norm of any vector $v$ by $\|v\|$. Denote the (square) system of linear equations to be solved by

$$ {\displaystyle Ax=b.} $$

The matrix $A$ is assumed to be invertible of size m-by-m. Furthermore, it is assumed that $b$ is normalized, i.e., that $\|b\|=1$.

The n-th Krylov subspace for this problem is

$$K_{n}=K_{n}(A,r_{0})= \text{span} \lbrace r_{0},Ar_{0},A^{2}r_{0},\ldots ,A^{n-1}r_{0} \rbrace$$

where $r_{0}=b-Ax_{0}$ is the initial residual for an initial guess $x_{0}$. Clearly $r_{0}=b$ if $x_{0}=0$.

GMRES approximates the exact solution of ${\displaystyle Ax=b}$ by the vector ${\displaystyle x_{n}\in K_{n}}$ that minimizes the Euclidean norm of the residual ${\displaystyle r_{n}=b-Ax_{n}}$.

The vectors ${\displaystyle r_{0},Ar_{0},\ldots A^{n-1}r_{0}}$ might be close to linearly dependent, so instead of this basis, the Arnoldi iteration is used to find orthonormal vectors $q_{1},q_{2},\ldots ,q_{n}$, which form a basis for ${\displaystyle K_{n}}$. In particular, $q_1 = ||r_{0}||_{2}^{-1} r_0$ .

Therefore, the vector ${\displaystyle x_{n}\in K_{n}}$ can be written as ${\displaystyle x_{n}=x_{0}+Q_{n}y_{n}}$ with ${\displaystyle y_{n}\in \mathbb {R} ^{n}}$ , where ${\displaystyle Q_{n}}$ is the m-by-n matrix formed by ${\displaystyle q_{1},\ldots ,q_{n}}$ .

The Arnoldi process also produces an $(n+1)$-by-$n$ upper Hessenberg matrix ${\tilde{H}}_{n}$ with

$$ {\displaystyle AQ_{n}=Q_{n+1}{\tilde {H}}_{n}.} $$
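This relation is easy to verify numerically, assuming the arnoldi function sketched above is on the Octave path (the test data is our own choice):

    A = rand(50); b = rand(50, 1); n = 10;
    [Q, H] = arnoldi(A, b, n);
    norm(A * Q(:,1:n) - Q * H)   % should be of the order of machine precision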

For symmetric matrices, ${\tilde{H}}_{n}$ actually reduces to a symmetric tridiagonal matrix, which yields the MINRES method.

Because columns of ${\displaystyle Q_{n}}$ are orthonormal, we have

$$\begin{split} \|r_{n}\| &= \|b-Ax_{n}\| = \|b-A(x_{0}+Q_{n}y_{n})\| = \|r_{0}-AQ_{n}y_{n}\| = \|\beta q_{1}-AQ_{n}y_{n}\| \\ &= \|\beta q_{1}-Q_{n+1}{\tilde{H}}_{n}y_{n}\| = \|Q_{n+1}(\beta e_{1}-{\tilde{H}}_{n}y_{n})\| = \|\beta e_{1}-{\tilde{H}}_{n}y_{n}\|, \end{split}$$

where

$$ e_{1}=(1,0,0,\ldots ,0)^{T}, $$

is the first vector in the standard basis of ${\displaystyle \mathbb {R} ^{n+1}}$ , and

$$ \beta =\|r_{0}\|, $$

$r_{0}$ being the initial residual (the initial guess $x_{0}$ is usually zero, in which case $r_{0}=b$). Hence, $x_{n}$ can be found by minimizing the Euclidean norm of the residual

$$\tilde{r}_{n}={\tilde{H}}_{n}y_{n}-\beta e_{1}.$$

This is a linear least squares problem of size n.

This yields the GMRES method. On the ${\displaystyle n}$-th iteration:

1. calculate $q_{n}$ with the Arnoldi method;
2. find the $y_{n}$ which minimizes $\|\tilde{r}_{n}\|$;
3. compute $x_{n}=x_{0}+Q_{n}y_{n}$;
4. repeat if the residual is not yet small enough.
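Putting the pieces together, here is a minimal Octave sketch of the vanilla method (our own illustrative code; the name gmres_vanilla and its interface are ours, and, as in the project notebook, no Givens rotations are used):

    % Vanilla GMRES: Arnoldi + small least squares problem, no restarts.
    % Save as gmres_vanilla.m.
    function [x, resvec] = gmres_vanilla(A, b, x0, maxit, tol)
      m = length(b);
      r0 = b - A * x0;
      beta = norm(r0);
      Q = zeros(m, maxit+1);
      H = zeros(maxit+1, maxit);
      Q(:,1) = r0 / beta;
      resvec = zeros(maxit, 1);
      for n = 1:maxit
        v = A * Q(:,n);                    % one Arnoldi step
        for j = 1:n
          H(j,n) = Q(:,j)' * v;
          v = v - H(j,n) * Q(:,j);
        end
        H(n+1,n) = norm(v);
        if H(n+1,n) > eps
          Q(:,n+1) = v / H(n+1,n);
        end
        e1 = zeros(n+1, 1); e1(1) = beta;  % beta * e_1
        y = H(1:n+1, 1:n) \ e1;            % least squares problem of size n
        resvec(n) = norm(e1 - H(1:n+1, 1:n) * y);
        if resvec(n) / norm(b) < tol       % residual small enough: stop
          resvec = resvec(1:n);
          break
        end
      end
      x = x0 + Q(:, 1:n) * y;              % x_n = x_0 + Q_n * y_n
    end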

At every iteration, a matrix-vector product ${\displaystyle Aq_{n}}$ must be computed. This costs about ${\displaystyle 2m^{2}}$ floating-point operations for general dense matrices of size ${\displaystyle m}$, but the cost can decrease to ${\displaystyle O(m)}$ for sparse matrices. In addition to the matrix-vector product, ${\displaystyle O(nm)}$ floating-point operations must be computed at the n-th iteration.

The least squares problem

One part of the GMRES method is to find the vector ${\displaystyle y_{n}}$ which minimizes

$${\displaystyle \|{\tilde {H}}_{n}y_{n}-\beta e_{1}\|.\,}$$

In practical implementations this subproblem is usually solved using Givens rotations, but in this vanilla implementation we do without them in order to keep the code easier to understand.
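For reference, this is roughly what the Givens approach looks like in Octave, given the $\tilde{H}_{n}$, $\beta$ and $n$ of the discussion above (our own sketch; practical codes apply one new rotation per iteration instead of re-reducing the whole matrix as done here):

    % Reduce min ||beta*e1 - H*y|| with Givens rotations: R becomes upper
    % triangular and g accumulates the rotated right-hand side.
    g = zeros(n+1, 1); g(1) = beta;
    R = H(1:n+1, 1:n);
    for k = 1:n
      a = R(k,k); c = R(k+1,k);            % entries defining the k-th rotation
      d = hypot(a, c);
      G = [a c; -c a] / d;                 % zeroes out R(k+1,k)
      R(k:k+1, k:n) = G * R(k:k+1, k:n);
      g(k:k+1) = G * g(k:k+1);
    end
    y = R(1:n, 1:n) \ g(1:n);              % back substitution
    min_residual = abs(g(n+1));            % |last entry| = minimal residual norm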

Convergence

One of the most important characteristics of GMRES is that it always arrives at the exact solution (if one exists). At the $n$-th iteration, GMRES computes the best approximation to the solution of $Ax = b$ over $x_n \in K_n$. If $A$ has full rank, then $K_m = F^m$, so the $m$-th iteration always returns the exact answer.
Sometimes the exact solution already lies in $K_n$ for some $n < m$; in this case $x_n$ is an exact solution. In either case, the algorithm is considered convergent after $n$ steps if the $n$-th residual is sufficiently small.

Fast convergence after only a few iterations, however, does not happen in general. Indeed, a theorem of Greenbaum, Pták and Strakoš states that for every nonincreasing sequence $a_1 \geq a_2 \geq \cdots \geq a_{m-1} \geq a_m = 0$, one can find a matrix $A$ such that $\|r_n\| = a_n$ for all $n$, where $r_n$ is the residual defined above. In particular, it is possible to find a matrix for which the residual stays constant for $m - 1$ iterations and only drops to zero at the last iteration.

In practice, though, GMRES often performs well. This can be proven in specific situations. If the symmetric part of A, that is ${\displaystyle (A^{T}+A)/2}$, is positive definite, then

$${\displaystyle \|r_{n}\|\leq \left(1-{\frac {\lambda _{\min }^{2}(1/2(A^{T}+A))}{\lambda _{\max }(A^{T}A)}}\right)^{n/2}\|r_{0}\|,}$$

where ${\displaystyle \lambda _{\mathrm {min} }(M)}$ and ${\displaystyle \lambda _{\mathrm {max} }(M)}$ denote the smallest and largest eigenvalue of the matrix ${\displaystyle M}$, respectively.

If A is symmetric and positive definite, then we even have

$${\displaystyle \|r_{n}\|\leq \left({\frac {\kappa _{2}(A)^{2}-1}{\kappa _{2}(A)^{2}}}\right)^{n/2}\|r_{0}\|,}$$

where ${\displaystyle \kappa _{2}(A)}$ denotes the condition number of A in the Euclidean norm.
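This bound can be checked experimentally with the built-in gmres of Octave/MATLAB (a rough sketch; the tridiagonal test matrix and all parameters are our own choices):

    % Compare actual GMRES residual norms with the SPD bound above.
    m = 100;
    e = ones(m, 1);
    A = spdiags([-e 4*e -e], -1:1, m, m);   % symmetric positive definite
    b = randn(m, 1);
    [x, flag, relres, iter, resvec] = gmres(A, b, [], 1e-12, m);
    kappa = cond(full(A));
    n = (1:numel(resvec)-1)';
    bound = ((kappa^2 - 1) / kappa^2) .^ (n/2) * resvec(1);
    semilogy(n, resvec(2:end), 'o-', n, bound, '--');
    legend('GMRES residual norms', 'theoretical bound');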

In the general case, where A is not positive definite, we have

$${\displaystyle {\frac {\|r_{n}\|}{\|b\|}}\leq \inf _{p\in P_{n}}\|p(A)\|\leq \kappa _{2}(V)\inf _{p\in P_{n}}\max _{\lambda \in \sigma (A)}|p(\lambda )|,\,}$$

where $P_n$ denotes the set of polynomials of degree at most n with $p(0) = 1$ , $V$ is the matrix appearing in the spectral decomposition of $A$ , and $\sigma(A)$ is the spectrum of $A$ . Roughly speaking, this says that fast convergence occurs when the eigenvalues of $A$ are clustered away from the origin and $A$ is not too far from normality.

All these inequalities bound only the residuals instead of the actual error, that is, the distance between the current iterate $x_n$ and the exact solution.

GMRES with Restart

The first few iterations of GMRES have low spatial and temporal complexity. However, as $k$ increases, the $k$-th iteration of GMRES becomes more expensive in both time and memory. In fact, computing the $k$-th iteration of GMRES for very large $k$ can be prohibitively expensive.

This issue is addressed by using GMRES(k), or GMRES with restarts. When $k$ becomes large, this algorithm restarts GMRES with an improved initial guess. The new initial guess is taken to be the vector that was found upon termination of the last GMRES iteration run. The algorithm GMRES(k) will always have manageable spatial and temporal complexity, but it is less reliable than GMRES. If the true solution $x$ to $Ax = b$ is nearly orthogonal to the Krylov subspaces $K_n(A, b)$ for $n \le k$, then GMRES(k) could converge very slowly or not at all.
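A minimal Octave sketch of the restart loop, built on the gmres_vanilla function sketched earlier (the names are ours):

    % GMRES(k): restart plain GMRES every k iterations, using the last
    % iterate as the new initial guess.
    function x = gmres_restarted(A, b, k, tol, max_restarts)
      x = zeros(length(b), 1);              % initial guess x_0 = 0
      for i = 1:max_restarts
        [x, resvec] = gmres_vanilla(A, b, x, k, tol);
        if resvec(end) / norm(b) < tol      % converged
          return
        end
      end
    end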


Implementation

The project is composed of two parts:

  1. Arnoldi Iteration implementation and testing: the Arnoldi_Iteration/ folder 📂 contains a .ipynb notebook in which the Arnoldi Iteration algorithm has been implemented.
    The main purpose of this notebook is to demonstrate that with the Arnoldi Iteration it is possible to approximate some eigenvalues of a matrix. This is useful above all for large and sparse matrices, where other methods (like the Power Method) cannot be applied. We show that for small matrices the eigenvalue found by the Arnoldi Iteration and by the Power Method is the same but, for very large matrices, the Power Method is inapplicable while the Arnoldi Iteration can find a good approximation of the largest eigenvalue of the matrix (given a sufficient number of iterations). The results and the implementations are reported in the notebook.
  2. GMRES algorithm implementation and testing: the GMRES/ folder 📂 contains a .ipynb notebook in which the GMRES algorithm has been implemented. This time the code is written in Octave, so, in order to execute the code on your machine, you can follow the Installation and Execution Guides in the sections below. The notebook is also provided as an Octave file, so you can execute that directly without installing any docker container.
    The main purpose of this notebook is to show the possible applications, the behaviour, the advantages and the drawbacks of this algorithm. The notebook starts with a brief part explaining the possible ways to load very large sparse matrices: you can use some example matrices from https://sparse.tamu.edu/, or you can generate them randomly in different ways (well- or ill-conditioned); a brief Octave sketch of the latter is given after this list.
    The file proceeds with the standard GMRES implementation and the GMRES with Restarting implementation. The remaining part of the notebook contains many tests that we carried out in order to show, in a simple way, when the algorithm converges and when it does not. You can see all the results in the notebook; to summarise, we report a brief explanation here:
    We show that the GMRES algorithm does not converge for every matrix (to be precise, in its standard form it often fails to converge within a practical number of iterations). Of course, it can be shown that the algorithm always converges in a number of iterations equal to the dimension of the matrix, but this is unaffordable in real applications when the matrix is huge. The favourable cases, in which the algorithm converges quickly, are those in which the eigenvalues of the matrix are clustered away from the origin of the complex plane; in the other cases convergence can be arbitrarily slow. To address this problem, several techniques have been proposed and implemented over the years: for example, it is possible to apply a preconditioner to the matrix, which drives the eigenvalues to be clustered away from the origin.
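Here is the brief sketch of the random generation mentioned above (our illustration, not the notebook's exact code); sprandn alone tends to produce a spectrum spread around the origin, the hard case for GMRES, while adding a multiple of the identity shifts it away:

    % Two random sparse test systems for GMRES.
    m = 1000;
    A_hard = sprandn(m, m, 0.01);                 % eigenvalues around the origin: slow
    A_easy = sprandn(m, m, 0.01) + 5 * speye(m);  % shifted spectrum: fast convergence
    b = randn(m, 1);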

Installation Guide

To prepare the system to run the project, you need something able to execute a Jupyter Notebook with all the dependencies and requirements.
You can set this up yourself but, to make this easier, we provide a docker container with everything you need. To install it, move into the docker_container/ 📂 folder and run the following command:

docker compose up

Now you have a system able to execute all the code 🦧.

Execution Guide

To start the docker container, run:

docker start -a docker_container-octave-1

To run the notebook, open the Jupyter web page by following the link that the terminal returns after starting the docker container, for example:

http://127.0.0.1:8888/lab?token=c8c09f0b3ce7078426536c1152713b850a24363715eca933

Now move to the notebook that you want to execute and open it (from the web page). You also have to choose the correct Jupyter kernel:

  • for the GMRES algorithm, choose the Octave kernel;
  • for the Arnoldi Iteration, choose the Python kernel.

Now everything is done.

To stop the docker container you have to run:

docker stop docker_container-octave-1

References


Authors

Cristian Cosci 🐔 Nicolò Vescera 🦧 Fabrizio Fagiolo 🐛