Turbine is the set of development tools that the SHARK Team is building for deploying all of our models to the cloud and devices. We are building it as we transition from our TorchScript-era one-off export and compilation to a unified approach based on PyTorch 2 and Dynamo. While we use it heavily ourselves, it is intended to be a general purpose model compilation and execution tool.
Turbine provides three primary tools:
- AOT Export: For compiling one or more nn.Modules to compiled, deployment ready artifacts. This operates via both a simple one-shot export API for simple models and an underlying advanced API for complicated models and accessing the full features of the runtime.
- Eager Execution: A torch.compile backend is provided and a Turbine Tensor/Device is available for more native, interactive use within a PyTorch session.
- Turbine Kernels: (coming soon) A union of the Triton approach and Pallas but based on native PyTorch constructs and tracing. It is intended as a complement for simple cases where direct emission to the underlying, cross platform, vector programming model is desirable.
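As a rough illustration of the one-shot AOT export path described above, the sketch below exports a small nn.Module. It assumes that shark_turbine.aot exposes export(), and that the returned object offers print_readable() and compile(save_to=None), as in the project's simple MLP sample; these names may change while Turbine is under active development. TinyMLP and export_tiny_mlp are hypothetical names introduced only for this example.

```python
def export_tiny_mlp():
    """Export a tiny MLP with Turbine's one-shot AOT API (sketch)."""
    # Imports live inside the function so this file can be loaded even
    # where torch / shark-turbine are not installed.
    import torch
    import torch.nn as nn
    import shark_turbine.aot as aot  # assumption: API as in the MLP sample

    class TinyMLP(nn.Module):  # hypothetical example model
        def __init__(self):
            super().__init__()
            self.layer0 = nn.Linear(8, 8, bias=True)
            self.layer1 = nn.Linear(8, 2, bias=True)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.layer1(torch.relu(self.layer0(x)))

    model = TinyMLP()
    # Static shapes: the example argument fixes the input to 4x8 float32.
    example_x = torch.empty(4, 8, dtype=torch.float32)

    exported = aot.export(model, example_x)
    exported.print_readable()  # dump the Torch-dialect MLIR
    # save_to=None asks for an in-memory binary rather than a file on disk.
    return exported.compile(save_to=None)
```

Calling export_tiny_mlp() in an environment with the dependencies installed should print the exported module and return the compiled binary. More complicated models would go through the advanced API instead, which is not shown here.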
Under the covers, Turbine is based heavily on IREE and torch-mlir and we use it to drive evolution of both, upstreaming infrastructure as it becomes timely to do so.
Turbine is under active development. If you would like to participate as it comes online, please reach out to us on the #turbine channel of the nod-ai Discord server.
- Install the package:
pip install shark-turbine
# Or editable: pip install -e .
The above installs some unnecessary CUDA/cuDNN packages for CPU use. To avoid this, you can specify pytorch-cpu and install via:
pip install --index-url https://download.pytorch.org/whl/cpu \
-r pytorch-cpu-requirements.txt \
-r torchvision-requirements.txt
pip install shark-turbine
(or follow the "Developers" instructions below for installing from head/nightly)
- Try one of the samples:
Generally, we use Turbine to produce valid, dynamic shaped Torch IR (from the torch-mlir torch dialect with various approaches to handling globals). Depending on the use-case and status of the compiler, these should be compilable via IREE with --iree-input-type=torch for end to end execution. Dynamic shape support in torch-mlir is a work in progress, and not everything works at head with release binaries at present.
- AOT MLP With Static Shapes
- AOT MLP with a dynamic batch size
- AOT llama2: Dynamic sequence length custom compiled module with state management internal to the model.
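The IREE compile step mentioned above can be sketched as a small helper that assembles an iree-compile command line. Only --iree-input-type=torch comes from the text; the file names mlp.mlir and mlp.vmfb and the helper name iree_compile_command are hypothetical, and a real invocation typically also needs target selection flags, which the caller passes through extra_flags.

```python
import shlex


def iree_compile_command(mlir_path, output_path, extra_flags=()):
    """Build an iree-compile argument list for Torch-dialect input."""
    return [
        "iree-compile",
        "--iree-input-type=torch",  # flag named in the text above
        *extra_flags,               # e.g. target selection; left to the caller
        str(mlir_path),
        "-o",
        str(output_path),
    ]


cmd = iree_compile_command("mlp.mlir", "mlp.vmfb")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) once the input file exists and iree-compile is on the path.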
If you only want to develop against this project, you need to install the Python deps for the following:
- PyTorch
- iree-compiler (with Torch input support)
- iree-runtime
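A quick way to see which of these dependencies are already present is to query the installed distribution metadata. The sketch below uses only the standard library; it assumes the distributions are published under the names torch, iree-compiler, and iree-runtime, and the helper name installed_versions is made up for this example.

```python
from importlib import metadata

# Distribution names as installed by pip; "torch" is the PyTorch package.
REQUIRED = ("torch", "iree-compiler", "iree-runtime")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

This only reports what is installed; it does not check that the versions match the pinned ones.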
The pinned deps at HEAD require pre-release versions of all of the above, and therefore require additional pip flags to install. To make development setup easier, we provide a requirements.txt file which installs precise versions and has all flags.
Installing into a venv is highly recommended. The requirements file can be installed prior to the package:
pip install --upgrade -r requirements.txt
pip install --upgrade -e .[torch-cpu-nightly,testing]
Run tests:
pytest
If doing native development of the compiler, it can be useful to switch to source builds for iree-compiler and iree-runtime.
In order to do this, check out IREE and follow the instructions to build from source, making sure to specify additional options:
-DIREE_BUILD_PYTHON_BINDINGS=ON -DPython3_EXECUTABLE="$(which python)"
Uninstall existing packages:
pip uninstall iree-compiler
pip uninstall iree-runtime
Copy the .env file from iree/ to this source directory to get IDE support and add to your path for use from your shell:
source .env && export PYTHONPATH
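After switching to source builds, it can help to confirm which copy of the IREE Python packages the interpreter actually resolves. The following is a minimal sketch using only the standard library; it assumes the bindings are importable as iree.compiler and iree.runtime, and module_origin is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package (e.g. "iree") is itself missing.
        return None
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no single file; report their search paths instead.
    locations = spec.submodule_search_locations
    return ", ".join(locations) if locations else None


for mod in ("iree.compiler", "iree.runtime"):
    print(mod, "->", module_origin(mod))
```

If the printed paths point into the IREE build tree rather than site-packages, the source build is the one in use.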