Brevitas

Brevitas is a PyTorch library for neural network quantization, with support for both post-training quantization (PTQ) and quantization-aware training (QAT).

Please note that Brevitas is a research project and not an official Xilinx product.

If you like this project, please consider ⭐ this repo, as it is the simplest and best way to support it.

Requirements

Python >= 3.8.
PyTorch >= 1.9.1, <= 2.1 (more recent versions are untested).
Windows, Linux or macOS.
GPU training-time acceleration (optional but recommended).

Installation

You can install the latest release from PyPI:

pip install brevitas

Getting Started

Brevitas currently offers quantized implementations of the most common PyTorch layers used in DNNs under brevitas.nn, such as QuantConv1d, QuantConv2d, QuantConvTranspose1d, QuantConvTranspose2d, QuantMultiheadAttention, QuantRNN, QuantLSTM etc., for adoption within PTQ and/or QAT. For each of these layers, the quantization of different tensors (inputs, weights, bias, outputs, etc.) can be tuned individually according to a wide range of quantization settings.

As a reference for PTQ, Brevitas provides an example user flow for ImageNet classification models under brevitas_examples.imagenet_classification.ptq. It quantizes an input torchvision model under different quantization configurations (e.g. bit-width and granularity of scale).
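To give a feel for what bit-width and scale granularity mean in such a configuration, here is a conceptual sketch in plain Python. It is not Brevitas code: the helper names (max_abs_scale, quantize_symmetric, max_error), the toy weights, and the choice of symmetric max-based scaling are all assumptions made for illustration. It compares a single per-tensor scale with one scale per output channel.

```python
def max_abs_scale(values, bit_width):
    """Scale that maps the largest magnitude onto the top of the signed integer range."""
    qmax = 2 ** (bit_width - 1) - 1  # e.g. 7 for 4 bits
    largest = max(abs(v) for v in values)
    return largest / qmax if largest > 0 else 1.0


def quantize_symmetric(values, bit_width, scale):
    """Round to the integer grid, clamp to [-qmax, qmax], then map back to floats."""
    qmax = 2 ** (bit_width - 1) - 1
    result = []
    for v in values:
        q = round(v / scale)
        q = max(-qmax, min(qmax, q))
        result.append(q * scale)
    return result


def max_error(original, quantized):
    """Largest absolute difference between original and quantized values."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Toy weight matrix: two output channels with very different magnitudes.
weights = [[0.02, -0.01, 0.03], [1.5, -2.0, 0.7]]
bit_width = 4

# Per-tensor granularity: one scale shared by every weight.
flat = [v for row in weights for v in row]
tensor_scale = max_abs_scale(flat, bit_width)
per_tensor = [quantize_symmetric(row, bit_width, tensor_scale) for row in weights]

# Per-channel granularity: each output channel gets its own scale.
per_channel = [
    quantize_symmetric(row, bit_width, max_abs_scale(row, bit_width))
    for row in weights
]

err_tensor = max_error(weights[0], per_tensor[0])
err_channel = max_error(weights[0], per_channel[0])
print(err_tensor, err_channel)
```

With a shared scale, the small-magnitude channel collapses to zero, while a per-channel scale keeps its values distinguishable at the same bit-width. This is the kind of trade-off that different PTQ configurations explore.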

For more info, check out https://xilinx.github.io/brevitas/getting_started.

Cite as

If you adopt Brevitas in your work, please cite it as:

@software{brevitas,
  author       = {Alessandro Pappalardo},
  title        = {Xilinx/brevitas},
  year         = {2023},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3333552},
  url          = {https://doi.org/10.5281/zenodo.3333552}
}

History