DBCSR is a library designed to efficiently perform sparse matrix matrix multiplication, among other operations. It is MPI and OpenMP parallel and can exploit GPUs via CUDA.
You absolutely need:
- GNU make
- a Fortran compiler which supports at least Fortran 2003 (respectively 2008+TS when using the C-bindings)
- a LAPACK implementation (reference, OpenBLAS-bundled and MKL have been tested)
- a BLAS implementation (reference, OpenBLAS-bundled and MKL have been tested)
- a Python version installed (2.7 or 3.6+ have been tested)
Optionally:
- libxsmm (1.8.2+ with make-only, 1.10+ with cmake) for Small Matrix Multiplication acceleration
- CMake (3.10+)
To build with CUDA support you further need:
- CUDA Toolkit
- a C++ compiler which supports at least C++11 standard
We test against GNU and Intel compilers on Linux systems, GNU compiler on MacOS systems.
Download either a release tarball or clone the latest version from Git using:
git clone --recursive https://github.com/cp2k/dbcsr.git
Run
make help
to list all possible targets.
Update the provided Makefile.inc to fit your needs (read the documentation inside the file for further explanations) and then run
make <target>
Some examples on how to use the library (which is the only current documentation) are available under the examples directory (see readme).
You can compile with
make CINT=1
to generate the C interface. Make sure your Fortran compiler supports F2008 standard (including the TS) by updating the flag in the Makefile.inc.
Building with CMake is also supported:
mkdir build
cd build
cmake ..
The configuration flags are (default first):
-DUSE_MPI=<ON|OFF>
-DUSE_OPENMP=<ON|OFF>
-DUSE_SMM=<blas|libxsmm>
-DUSE_CUDA=<OFF|ON>
-DWITH_C_API=<ON|OFF>
-DWITH_EXAMPLES=<ON|OFF>
-DWITH_GPU=<P100|K20X|K40|K80>
-DTEST_MPI_RANKS=<auto,N>
Your contribution to the project is welcome! Please see Contributing.md and this wiki page. For any help, please notify the other developers.