
A Plug-and-play Lightweight tool for the Inference Optimization of Deep Neural networks


Logo partially generated with Stable Diffusion.

PLiNIO is a Python package based on PyTorch that provides a Plug-and-play Lightweight tool for the Inference Optimization of Deep Neural networks (DNNs). It lets you automatically optimize a DNN architecture by adding three lines of code to your standard PyTorch training loop.

PLiNIO uses gradient-based optimization algorithms, such as Differentiable Neural Architecture Search (DNAS) and Differentiable Mixed-Precision Search (DMPS), to keep search costs low.

Reference

If you use PLiNIO to optimize your model, please cite our paper (https://arxiv.org/abs/2307.09488):

@misc{plinio,
      title={PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization},
      author={D. {Jahier Pagliari} and M. {Risso} and B. A. {Motetti} and A. {Burrello}},
      year={2023},
      eprint={2307.09488},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Optimization Methods

The following optimization strategies are currently supported:

  • SuperNet, a coarse-grained DNAS for layer selection inspired by DARTS.

  • PIT, a fine-grained DNAS for layer geometry optimization (channel pruning, filter size pruning, dilation increase).

  • MPS, a differentiable Mixed-Precision Search algorithm that extends EdMIPS to support channel-wise precision optimization and joint pruning and MPS. When MPS is applied with a single precision choice, it implements standard Quantization-Aware Training (QAT).

  • ODiMO MPS, an implementation of the One-shot Differentiable Mapping Optimizer (ODiMO) concept, which recasts the problem of deploying a DNN onto multiple accelerators that support incompatible data representations as a Differentiable MPS. More details in our paper.

In the code snippet above, Method() should be replaced with the name of one of the supported optimization methods. More information on each method can be found in its dedicated page.
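The differentiable relaxation that DARTS-style DNAS methods rely on can be sketched in a few lines of plain Python. This is an illustration of the general idea only, not PLiNIO's implementation: every name below (softmax, mixed_output, expected_cost, select, and the toy candidates and costs) is hypothetical, and scalars stand in for tensors. Each candidate layer gets a trainable architecture parameter; the layer output is the softmax-weighted sum of all candidates, and the cost seen by the loss is the equally weighted sum of the candidates' costs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(x, candidates, alphas):
    """Search-time output: softmax-weighted sum of all candidate operations."""
    probs = softmax(alphas)
    return sum(p * op(x) for p, op in zip(probs, candidates))


def expected_cost(costs, alphas):
    """Differentiable cost estimate: softmax-weighted sum of candidate costs."""
    probs = softmax(alphas)
    return sum(p * c for p, c in zip(probs, costs))


def select(alphas):
    """After the search, keep only the candidate with the largest parameter."""
    return max(range(len(alphas)), key=lambda i: alphas[i])


# Toy candidates (assumptions for illustration): identity, a "heavy" op, a skip.
candidates = [lambda x: x, lambda x: 2.0 * x, lambda x: 0.0]
costs = [0.0, 100.0, 0.0]   # hypothetical per-candidate costs
alphas = [0.0, 0.0, 0.0]    # architecture parameters, trained by gradient descent

y = mixed_output(3.0, candidates, alphas)
c = expected_cost(costs, alphas)
best = select([0.1, 2.0, -1.0])
```

In an actual search the architecture parameters would be tensors updated by the optimizer together with the weights, so that adding the expected cost to the loss pushes probability mass toward cheaper candidates.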

Hardware Cost Models

PLiNIO focuses on hardware-awareness and accurate cost modeling. Its main use case is finding DNNs that are not only accurate, but also efficient in terms of one or more cost metrics, or that respect user-defined cost constraints. Besides common hardware-independent cost metrics (number of parameters and number of operations, or OPs, per inference), PLiNIO also provides more advanced models that account for the spatial parallelism, dataflow, and other characteristics of specific HW platforms.

Both generic and HW-specific cost models are defined in the plinio.cost sub-package, and the library is designed so that users can easily extend it with custom models for their own hardware. More information can be found here.
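As a rough illustration of the two hardware-independent metrics, the sketch below computes the parameter count and the number of multiply-accumulate operations (MACs) of a standard 2D convolution with groups=1. The function names are hypothetical and this is not the plinio.cost API; it only shows the arithmetic that such a cost model has to capture.

```python
def conv2d_params(c_in, c_out, kernel_size, bias=True):
    """Parameter count of a standard (groups=1) 2D convolution."""
    k_h, k_w = kernel_size
    weights = c_out * c_in * k_h * k_w
    return weights + (c_out if bias else 0)


def conv2d_macs(c_in, c_out, kernel_size, out_size):
    """MACs per inference: one MAC per weight per output pixel."""
    k_h, k_w = kernel_size
    out_h, out_w = out_size
    return c_out * c_in * k_h * k_w * out_h * out_w


# Example layer: 3 -> 16 channels, 3x3 kernel, 32x32 output feature map.
full_params = conv2d_params(3, 16, (3, 3))             # 16*3*9 + 16 = 448
full_macs = conv2d_macs(3, 16, (3, 3), (32, 32))       # 432 * 1024 = 442368

# Pruning half of the output channels halves both metrics for this layer.
pruned_params = conv2d_params(3, 8, (3, 3))            # 8*3*9 + 8 = 224
pruned_macs = conv2d_macs(3, 8, (3, 3), (32, 32))      # 216 * 1024 = 221184
```

Both quantities depend on the number of channels and on the kernel size, which is why geometry-level choices such as channel pruning translate directly into a lower cost.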

Regularizers

The simplest way to implement a cost-aware DNN optimization in PLiNIO is to add a cost term to the loss function of your PyTorch training loop, as shown at the top of this page. The cost term is usually multiplied by a scalar regularization strength:

# before:
# loss = criterion(output, target)
# after:
loss = criterion(output, target) + reg_strength * model.cost

However, PLiNIO also supports more advanced regularizers, which are useful in scenarios such as:

  • The need to co-optimize for multiple cost metrics (such as number of parameters and OPs).
  • The need to treat cost metrics as constraints rather than as secondary optimization objectives.

To this end, PLiNIO provides the regularizers sub-package, which enables this more general format:

# before:
# loss = criterion(output, target)
# after:
loss = criterion(output, target) + regularizer(model)
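The two scenarios listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the API of the regularizers sub-package: the function names, the cost dictionary, and all numeric values are hypothetical, and plain floats stand in for the differentiable cost tensors that a real training loop would use.

```python
def weighted_sum_regularizer(costs, strengths):
    """Co-optimize several metrics: sum of strength * cost over all metrics."""
    return sum(strengths[name] * value for name, value in costs.items())


def constraint_regularizer(cost, target, strength):
    """Treat a metric as a constraint: penalize only the part above the target."""
    return strength * max(0.0, cost - target)


# Hypothetical cost values for one training step.
costs = {"params": 1000.0, "ops": 50000.0}
strengths = {"params": 1e-3, "ops": 1e-5}

multi_term = weighted_sum_regularizer(costs, strengths)

# Above the target the penalty grows linearly; below it the term vanishes,
# so only the task loss drives the optimization.
over_budget = constraint_regularizer(1200.0, target=1000.0, strength=0.01)
within_budget = constraint_regularizer(800.0, target=1000.0, strength=0.01)
```

With PyTorch tensors, the same hinge-style penalty could be written with torch.clamp(cost - target, min=0), which keeps the term differentiable wherever the constraint is violated.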

More information about the available regularizers can be found here.

Installation

To install the latest version from source:

$ git clone https://github.com/eml-eda/plinio
$ cd plinio
$ pip install -r requirements.txt
$ python setup.py install

Example Script

TBD

Publications based on PLiNIO

Below is a list of links to publications that either describe elements of PLiNIO, or use it to optimize DNNs for specific applications:

Main paper describing the library:

Papers on novel NAS techniques included in PLiNIO:

Usages of PLiNIO to optimize DNNs for various applications:

License

The entire PLiNIO codebase is released under the Apache License 2.0.