burn-bench is a benchmarking repository for Burn. It helps
track performance across different hardware and software configurations, making it easier to
identify regressions, improvements, and the best backend for a given workload.
- `crates/backend-comparison/`: Benchmarks for backend performance, ranging from individual tensor operations to full forward and backward passes for a given model.
- `crates/burnbench/`: The core benchmarking crate and CLI. Can be used as a standalone tool or integrated as a library to define and run custom benchmark suites.
- (Future) `crates/integration-tests/`: TBD. We'd like to add more tests to capture more complex workloads, including evaluation of model convergence, metrics, and overall training performance.
To run backend performance benchmarks, use the burnbench CLI:

```sh
cargo run --release --bin burnbench -- run --benches unary --backends wgpu-fusion
```

Or use the shorthand alias:

```sh
cargo bb run -b unary -B wgpu-fusion
```

This will use the `main` branch of Burn by default.
To benchmark performance across one or more versions of Burn:

```sh
cargo bb run -b unary -B wgpu-fusion -V 0.18.0 main local
```

You can specify one or more versions and provide custom burnbench arguments to benchmark them.

Each version can be one of:

- A published release (e.g., `0.18.0`)
- A git branch (e.g., `main`)
- A git commit hash
- `local`
By default, the `local` version points to a relative path for the Burn repo directory (`../../burn` relative to `backend-comparison/`). This can be modified via the `BURN_BENCH_BURN_DIR` environment variable.
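For example, to benchmark a local Burn checkout that lives somewhere else, point the variable at it before running (the path below is a placeholder):

```sh
# Placeholder path; set this to your local Burn checkout.
export BURN_BENCH_BURN_DIR=/path/to/burn
cargo bb run -b unary -B wgpu-fusion -V local
```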
For detailed instructions, see `crates/burnbench/README.md` and `crates/backend-comparison/README.md`.
Burn supports sharing benchmark results to help users compare hardware and backend performance. Results are published at burn.dev/benchmarks.
To contribute benchmarks, authenticate using:

```sh
cargo run --release --bin burnbench -- auth
```

Then share results with:

```sh
cargo bb run --share --benches unary --backends wgpu-fusion
```

To develop burn-bench using your local development stack (including the benchmark server and website), use the alias `cargo bbd` instead of `cargo bb`. This alias builds burn-bench in debug mode and automatically points it to local endpoints.
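For example, the share command above can be pointed at the local stack by swapping the alias (this assumes the local benchmark server and website are already running):

```sh
# Debug build against local endpoints; assumes the local server/website are up.
cargo bbd run --share --benches unary --backends wgpu-fusion
```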
You can trigger benchmark execution on demand in a pull request by adding the label `ci:benchmarks`.
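If you prefer the GitHub CLI to the web UI, the label can be added with a standard `gh` command; the PR number below is a placeholder:

```sh
# 123 is a placeholder pull request number.
gh pr edit 123 --add-label "ci:benchmarks"
```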
The parameters passed to burn-bench are defined in a `benchmarks.toml` file located at the root of the pull request's repository.
Below is an example of such a file. Most fields are self-explanatory:
```toml
[environment]
gcp_gpu_attached = true
gcp_image_family = "tracel-ci-ubuntu-2404-amd64-nvidia"
gcp_machine_type = "g2-standard-4"
gcp_zone = "us-east1-c"
repo_full = "tracel-ai/burn"
rust_toolchain = "stable"
rust_version = "stable"
[burn-bench]
backends = ["wgpu"]
benches = ["matmul"]
dtypes = ["f32"]
```

The following diagram outlines the sequence of steps involved in executing benchmarks:

```mermaid
sequenceDiagram
actor Developer
participant PR as GitHub Pull Request
participant CI as Tracel CI Server
participant W as burn-bench Workflow
participant GCP as Google Cloud Platform
participant BB as burn-bench Runner
participant ORG as GitHub Organization
Developer->>PR: Add label "ci:benchmarks"
PR-->>CI: 🪝 Webhook "labeled"
CI->>PR: 💬 "Benchmarks Status (enabled)" 🟢
CI->>PR: Read file "benchmarks.toml"
CI->>PR: 💬 Read file error if any (end of sequence) ❌
CI->>W: Dispatch "burn-bench" workflow
W-->>CI: 🪝 Webhook "job queued"
CI->>GCP: 🖥️ Provision GitHub runners
GCP->>BB: Spawn instances
BB->>ORG: Register runners
ORG->>W: Start workflow matrix job (one per machine type)
W->>W: Write temporary `inputs.json`
W->>BB: 🔥 Execute benches with `inputs.json`
BB-->>CI: 🪝 Webhook "started" (first machine only)
CI->>PR: 💬 "Benchmarks Started"
BB->>BB: Run benchmarks
BB-->>CI: 🪝 Webhook "completed" (with data from `inputs.json`)
CI->>PR: 💬 "Benchmarks Completed" ✅
Note right of PR: End of sequence
Developer->>PR: Remove label "ci:benchmarks"
PR-->>CI: 🪝 Webhook "unlabeled"
CI->>PR: 💬 "Benchmarks Status (disabled)" 🔴
Note right of PR: End of sequence
Developer->>PR: Open pull request with "ci:benchmarks"
PR-->>CI: 🪝 Webhook "opened"
CI->>PR: Start sequence at [Read file "benchmarks.toml"]
Note right of PR: End of sequence
Developer->>PR: Update code with 🟢
PR-->>CI: 🪝 Webhook "synchronized"
CI->>PR: Restart sequence at [Read file "benchmarks.toml"]
Note right of PR: End of sequence
Developer->>PR: Merge pull request into main with 🟢
PR-->>CI: 🪝 Webhook "closed"
CI->>PR: Start sequence at [Read file "benchmarks.toml"] without the 💬 tasks
Note right of PR: End of sequence
```
You can also manually execute the [benchmarks.yml workflow][] via the GitHub Actions UI.
When triggering it manually, you'll need to fill in the required input fields. Each field includes a default value, making them self-explanatory.
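As an alternative to the Actions UI, you can dispatch it from the command line with the GitHub CLI. The input names below mirror `benchmarks.toml` but are assumptions; check the workflow file for the actual fields and their defaults:

```sh
# Input names are assumptions; see benchmarks.yml for the real fields and defaults.
gh workflow run benchmarks.yml -f backends=wgpu -f benches=matmul
```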
We welcome contributions to improve benchmarking coverage and add new performance tests.