dstack is an open-source engine for running GPU workloads on any cloud. It works with a wide range of cloud GPU providers (AWS, GCP, Azure, Lambda, TensorDock, Vast.ai, etc.) as well as on-premises servers.
- [2024/01] dstack 0.15.1: Kubernetes integration preview (Release)
- [2024/01] dstack 0.15.0: Resources, authentication, and more (Release)
- [2024/01] dstack 0.14.0: OpenAI-compatible endpoints preview (Release)
- [2023/12] dstack 0.13.0: Disk size, CUDA 12.1, Mixtral, and more (Release)
Before using dstack through the CLI or API, set up a dstack server.

The easiest way to install the server is via pip:
```shell
pip install "dstack[all]" -U
```
If you have default AWS, GCP, or Azure credentials on your machine, the dstack server will pick them up automatically. Otherwise, you need to manually specify the cloud credentials in ~/.dstack/server/config.yml.
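As a sketch, a minimal ~/.dstack/server/config.yml configuring an AWS backend might look like the following (the project name is illustrative; see the installation docs for the full schema):

```yaml
# ~/.dstack/server/config.yml (illustrative)
projects:
  - name: main          # hypothetical project name
    backends:
      - type: aws
        creds:
          type: default # pick up default AWS credentials from the environment
```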
For further details on setting up the server, refer to installation.
To start the server, use the dstack server command:
```shell
$ dstack server

Applying ~/.dstack/server/config.yml...

The admin token is "bbae0f28-d3dd-4820-bf61-8f4bb40815da"
The server is running at http://127.0.0.1:3000/
```
Note: It's also possible to run the server via Docker.
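For example, a sketch of running the server in Docker, assuming the dstackai/dstack image and the default port (adjust the volume path and published port as needed):

```shell
# Persist server state and config on the host (paths are illustrative)
docker run -p 3000:3000 \
  -v ~/.dstack/server/:/root/.dstack/server/ \
  dstackai/dstack
```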
Once the server is up, you can use either dstack's CLI or API to run workloads. Below is a live demo of how it works with the CLI.
Dev environments allow you to quickly provision a machine with a pre-configured environment, resources, IDE, code, etc.
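As an illustration, a minimal dev environment configuration might look like the following (the file name and resource values are hypothetical; the resources property was introduced in 0.15.0):

```yaml
# .dstack.yml (illustrative)
type: dev-environment
python: "3.11"
ide: vscode
resources:
  gpu: 24GB  # request any GPU with at least 24GB of memory
```

Applying the configuration with the dstack run CLI command provisions the machine and attaches your IDE to it.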
Tasks are perfect for scheduling all kinds of jobs (e.g., training, fine-tuning, processing data, batch inference, etc.) as well as running web applications.
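A task configuration follows the same shape; a sketch with hypothetical commands:

```yaml
# .dstack.yml (illustrative)
type: task
python: "3.11"
commands:
  - pip install -r requirements.txt  # hypothetical dependencies
  - python train.py                  # hypothetical training script
```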
Services make it very easy to deploy any model or web application as a public endpoint.
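A service additionally exposes a port as an endpoint; a minimal sketch (the port and command are illustrative stand-ins for a real model server):

```yaml
# .dstack.yml (illustrative)
type: service
python: "3.11"
port: 8000
commands:
  - python -m http.server 8000  # stand-in for a real inference server
```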
For additional information and examples, see the following links: