/dace

DaCe - Data Centric Parallel Programming

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

General Tests GPU Tests FPGA Tests Documentation Status PyPI version codecov

DaCe - Data-Centric Parallel Programming

Decoupling domain science from performance optimization.

DaCe is a fast parallel programming framework that takes code in Python/NumPy and other programming languages, and maps it to high-performance CPU, GPU, and FPGA programs, which can be optimized to achieve state-of-the-art. Internally, DaCe uses the Stateful DataFlow multiGraph (SDFG) data-centric intermediate representation: A transformable, interactive representation of code based on data movement. Since the input code and the SDFG are separate, it is possible to optimize a program without changing its source, so that it stays readable. On the other hand, transformations are customizable and user-extensible, so they can be written once and reused in many applications. With data-centric parallel programming, we enable direct knowledge transfer of performance optimization, regardless of the application or the target processor.

DaCe generates high-performance programs for:

  • Multi-core CPUs (tested on Intel, IBM POWER9, and ARM with SVE)
  • NVIDIA GPUs and AMD GPUs (with HIP)
  • Xilinx and Intel FPGAs

DaCe can be written inline in Python and transformed in the command-line/Jupyter Notebooks or SDFGs can be interactively modified using our Visual Studio Code extension.

Quick Start

Install DaCe with pip: pip install dace

Having issues? See our full Installation and Troubleshooting Guide.

Using DaCe in Python is as simple as adding a @dace decorator:

import dace
import numpy as np

@dace
def myprogram(a):
    for i in range(a.shape[0]):
        a[i] += i
    return np.sum(a)

Calling myprogram with any NumPy array or GPU array (e.g., PyTorch, Numba, CuPy) will generate data-centric code, compile, and run it. From here on out, you can optimize (interactively or automatically), instrument, and distribute your code. The code creates a shared library (DLL/SO file) that can readily be used in any C ABI compatible language (C/C++, FORTRAN, etc.).

For more information on how to use DaCe, see the samples or tutorials below:

Publication

The paper for the SDFG IR can be found here. Other DaCe-related publications are available on our website.

If you use DaCe, cite us:

@inproceedings{dace,
  author    = {Ben-Nun, Tal and de~Fine~Licht, Johannes and Ziogas, Alexandros Nikolaos and Schneider, Timo and Hoefler, Torsten},
  title     = {Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures},
  year      = {2019},
  booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
  series = {SC '19}
}

Contributing

DaCe is an open-source project. We are happy to accept Pull Requests with your contributions! Please follow the contribution guidelines before submitting a pull request.

License

DaCe is published under the New BSD license, see LICENSE.