Computing the matrix exponential of an operator acting on a vector is an important
problem in mathematics, physics and engineering. Many problems reduce
to an ODE of the form

    dv(t)/dt = A v(t),

with v(t) a time-dependent vector and A a linear operator.
The solution v(t) is subject to certain initial conditions, and can be
formally written as

    v(t) = exp(A t) v(0),

so calculating the matrix exponential provides us with the solution.
For small times t << 1, approximations of the exponential like Padé
or Crank-Nicolson work really well. The problem is
that for the application I am interested in, the matrix A
is a huge sparse matrix, and the Padé and similar approximations only work for
small matrices.
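As a minimal illustration in Python (my choice of language for the examples
here; scipy.linalg.expm implements a Padé-based scaling-and-squaring method,
and the 2x2 matrix is made up):

    import numpy as np
    from scipy.linalg import expm  # Pade-based scaling and squaring

    # Hypothetical small dense example: this is where Pade-type methods
    # work well; they become infeasible when A is huge and sparse.
    A = np.array([[ 0.0, 1.0],
                  [-1.0, 0.0]])
    v0 = np.array([1.0, 0.0])
    t = 0.01

    v_t = expm(t * A) @ v0  # v(t) = exp(A t) v(0)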
Since the matrix A is sparse, the exponential can be calculated
based on the Lanczos algorithm. By applying A multiple
times to the initial state v, and using a Gram-Schmidt-like procedure,
the Lanczos algorithm creates an orthogonal basis for the Krylov
subspace

    K_m = span{ v, A v, A^2 v, ..., A^(m-1) v }.

The operation of the matrix exponential on v is then calculated
in this subspace,

    exp(A t) v ≈ β V_m exp(T_m t) e_1,

where β = ||v||, the columns of V_m are the orthogonal basis vectors,
T_m = V_m^T A V_m, and e_1 is the first unit vector. This is possible
since the representation T_m of A in this limited subspace is again a small
(tridiagonal) matrix, and the exponential of that small matrix can be
calculated with the conventional Padé approximation.
The code for this procedure is pretty easy: you essentially have to write
a matrix-vector product routine that implements the application
of A to a vector v. I have this code running
on a single-processor machine.
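As a rough sketch of what that looks like in serial Python (the function
name expm_lanczos, the subspace size m, and the test matrix below are my
own choices for illustration; note that Lanczos assumes A is symmetric,
otherwise one would use Arnoldi instead):

    import numpy as np
    from scipy.linalg import expm

    def expm_lanczos(matvec, v, t, m=30):
        """Approximate exp(t*A) @ v in an m-dimensional Krylov subspace.
        matvec(x) must return A @ x; A is assumed real symmetric."""
        beta0 = np.linalg.norm(v)
        n = v.shape[0]
        V = np.zeros((n, m))   # orthogonal Krylov basis vectors
        T = np.zeros((m, m))   # tridiagonal representation of A
        V[:, 0] = v / beta0
        for j in range(m):
            w = matvec(V[:, j])               # the only place A appears
            T[j, j] = np.dot(V[:, j], w)
            w -= T[j, j] * V[:, j]            # Gram-Schmidt-like step
            if j > 0:
                w -= T[j, j - 1] * V[:, j - 1]
            if j + 1 == m:
                break
            beta = np.linalg.norm(w)
            if beta < 1e-12:                  # invariant subspace, stop early
                V, T = V[:, :j + 1], T[:j + 1, :j + 1]
                break
            T[j, j + 1] = T[j + 1, j] = beta
            V[:, j + 1] = w / beta
        e1 = np.zeros(T.shape[0])
        e1[0] = 1.0
        # exponential of the small matrix via the conventional Pade routine
        return beta0 * (V @ (expm(t * T) @ e1))

For example, with a random symmetric sparse matrix:

    import scipy.sparse as sp

    n = 100_000
    B = sp.random(n, n, density=1e-5, format='csr', random_state=0)
    A = B + B.T                    # symmetrize so Lanczos applies
    w = expm_lanczos(lambda x: A @ x, np.ones(n), t=0.1, m=40)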
So, in order to rewrite the matrix exponential code for a parallel machine,
I need to calculate the matrix-vector product in parallel. The question I am
interested in is therefore how to organise sparse matrix-vector multiplication
in parallel. Obviously, this is not the first application that relies heavily
on this operation.
Because the matrix is sparse, it looks very difficult to find the optimal
way of programming the matrix-vector product, since the best approach depends
critically on the distribution of the non-zero elements.
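One common organization, sketched here with mpi4py (my choice for
illustration), is a row-block distribution: each process owns a block of
rows of A and a full copy of the input vector, so the only communication
is reassembling the result vector:

    import numpy as np
    from mpi4py import MPI

    def parallel_matvec(local_block, x, comm):
        """y = A @ x with A distributed by row blocks.
        local_block holds this rank's rows of A (sparse, dense, or a
        matrix-free operator); every rank holds the full vector x."""
        y_local = local_block @ x                  # purely local work
        counts = comm.allgather(y_local.shape[0])  # rows owned per rank
        y = np.empty(sum(counts), dtype=y_local.dtype)
        comm.Allgatherv(y_local, (y, counts))      # gather full result
        return y

For a very sparse A this replication of the vector wastes memory and
bandwidth; schemes that exchange only the vector entries a process actually
needs do better, and choosing between them is exactly where the distribution
of the non-zero elements comes in.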
We don't want to use the generalisation of the uniprocessor BLAS routine
dgemv: for example, PLAPACK has the call PLA_dgemv, and ScaLAPACK
has PSgemv. These routines are optimized for dense matrices and
don't take advantage of the sparsity of the matrix A. In addition,
in our application A is so big that we never want to construct it explicitly;
we prefer to construct its elements on the fly when we calculate the
matrix-vector product.
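As a toy example of such an on-the-fly product (the 1-D Laplacian stencil
is just a stand-in for the real operator), a matrix-free routine can be fed
straight into the expm_lanczos sketch above:

    import numpy as np

    def laplacian_matvec(v):
        """Matrix-free y = A @ v for the 1-D discrete Laplacian:
        the stencil is applied directly and A is never stored."""
        y = -2.0 * v
        y[:-1] += v[1:]    # superdiagonal contribution
        y[1:] += v[:-1]    # subdiagonal contribution
        return y

    w = expm_lanczos(laplacian_matvec, np.ones(100_000), t=0.01)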