A lightweight package for fast, GPU-accelerated computation of gradients and Hessians of functions constructed via composition.
Deep neural networks (DNNs) and other composition-based models have become a staple of data science, garnering state-of-the-art results and gaining widespread use in the scientific community, particularly as surrogate models to replace expensive computations. The unrivaled universality and success of DNNs is due, in part, to the convenience of automatic differentiation (AD) which enables users to compute derivatives of complex functions without an explicit formula. Despite being a powerful tool to compute first-order derivatives (gradients), AD encounters computational obstacles when computing second-order derivatives (Hessians).
Knowledge of second-order derivatives is paramount in many growing fields and can provide insight into the optimization problem solved to build a good model. Hessians are notoriously challenging to compute efficiently with AD and cumbersome to derive and debug analytically. Hence, many algorithms approximate Hessian information, resulting in suboptimal performance. To address these challenges, hessQuik computes Hessians analytically and efficiently with an implementation that is accelerated on GPUs.
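To see the flavor of analytic Hessian computation, consider a single scalar-valued layer f(x) = σ(wᵀx + b): its gradient is σ'(z)·w and its Hessian is σ''(z)·wwᵀ by the chain rule. The following NumPy sketch is illustrative only (it is not hessQuik's implementation) and checks the analytic gradient against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w, b = rng.standard_normal(d), rng.standard_normal()
x = rng.standard_normal(d)

# scalar single layer f(x) = tanh(w.x + b)
def f(x):
    return np.tanh(w @ x + b)

z = w @ x + b
sigma_p = 1.0 - np.tanh(z) ** 2          # sigma'(z)
sigma_pp = -2.0 * np.tanh(z) * sigma_p   # sigma''(z)

grad = sigma_p * w                # analytic gradient: sigma'(z) * w
hess = sigma_pp * np.outer(w, w)  # analytic Hessian: sigma''(z) * w w^T

# central finite-difference check of the gradient
h = 1e-6
fd_grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(d)])
print(np.allclose(grad, fd_grad, atol=1e-6))  # True
```

hessQuik propagates such closed-form derivatives through every layer of a composition, avoiding the overhead AD incurs for second-order terms.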
For package usage and details, see our paper in the Journal of Open Source Software.
For detailed documentation, visit https://hessquik.readthedocs.io/.
From PyPI:

```shell
pip install hessQuik
```

From GitHub:

```shell
python -m pip install git+https://github.com/elizabethnewman/hessQuik.git
```
The following dependencies are installed automatically with pip:
- torch (recommended version >= 1.10.0, but code will run with version >= 1.5.0)
Once you have installed hessQuik, you can import the package as follows:

```python
import hessQuik.activations as act
import hessQuik.layers as lay
import hessQuik.networks as net
```
You can construct a hessQuik network from layers as follows:

```python
d = 10             # dimension of the input features
widths = [32, 64]  # hidden channel dimensions
f = net.NN(lay.singleLayer(d, widths[0], act=act.antiTanhActivation()),
           lay.resnetLayer(widths[0], h=1.0, act=act.softplusActivation()),
           lay.singleLayer(widths[0], widths[1], act=act.quadraticActivation())
           )
```
You can obtain gradients and Hessians via

```python
import torch

nex = 20  # number of examples
x = torch.randn(nex, d)
fx, dfx, d2fx = f(x, do_gradient=True, do_Hessian=True)
```
If you only require Laplacians, not full Hessians, you can obtain the gradients and Laplacians via

```python
fx, dfx, lapfx = f(x, do_gradient=True, do_Laplacian=True)
```
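The Laplacian is the trace of the Hessian, which is why it can be accumulated in forward mode without forming the full d × d matrix. A small NumPy illustration with a quadratic form (not hessQuik code):

```python
import numpy as np

# For f(x) = x^T A x, the Hessian is A + A^T and the Laplacian is its trace.
rng = np.random.default_rng(1)
d = 4
A = rng.standard_normal((d, d))

hess = A + A.T        # Hessian of the quadratic form
lap = np.trace(hess)  # Laplacian = trace of the Hessian
print(np.isclose(lap, 2 * np.trace(A)))  # True
```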
If you only require evaluations of the Jacobian and Hessian along certain directions, you can provide the directions in forward mode via

```python
k = 3  # number of directions
v = torch.randn(k, d)
fx, vdfx, vd2fxv = f(x, do_gradient=True, do_Hessian=True, v=v, forward_mode=True)
```
and in backward mode via

```python
m = widths[-1]  # dimension of the output features
v = torch.randn(m, k)
fx, dfxv, d2fxv = f(x, do_gradient=True, do_Hessian=True, v=v, forward_mode=False)
```
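Shape-wise, forward mode contracts the k directions against the input side of the Jacobian (a batch of Jacobian-vector products), while backward mode contracts against the output side (vector-Jacobian products). A NumPy sketch with a linear map, where the Jacobian is explicit (the shapes are illustrative, following the `v` conventions above):

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, k = 10, 6, 3
W = rng.standard_normal((d, m))
x = rng.standard_normal(d)

# For the linear map f(x) = W^T x, the Jacobian is J = W^T (shape m x d).
J = W.T

# forward mode: contract k directions v (k x d) with the input side (JVPs)
v_fwd = rng.standard_normal((k, d))
jvp = v_fwd @ J.T  # shape (k, m): derivative of f along each row of v

# backward mode: contract directions v (m x k) with the output side (VJPs)
v_bwd = rng.standard_normal((m, k))
vjp = J.T @ v_bwd  # shape (d, k): gradient of each projection v[:, i]^T f

print(jvp.shape, vjp.shape)  # (3, 6) (10, 3)
```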
Some important notes:
- Currently, this functionality is only supported for `singleLayer`, `resnetLayer`, and networks using only these types of layers, including `fullyConnectedNN` and `resnetNN`.
- If `do_Hessian=True`, then the full Hessian will be computed, even if `do_Laplacian=True` as well.
- Laplacians can only be computed in forward mode. Hence, if `do_Laplacian=True` and full Hessians are not requested, `hessQuik` will compute derivatives with `forward_mode=True` automatically.
- For evaluating derivatives along certain directions, the user must specify the mode of differentiation. Currently, this choice is not automated.
To make the code accessible, we provide some introductory Google Colaboratory notebooks.
Practical Use: Hermite Interpolation
Tutorial: Constructing and Testing hessQuik Layers
To contribute to `hessQuik`, follow these steps:

- Fork the `hessQuik` repository
- Clone your fork using `git clone https://github.com/<username>/hessQuik.git`
- Contribute to your forked repository
- Create a pull request

If your code passes the necessary numerical tests and is well-documented, your changes and/or additions will be merged into the main `hessQuik` repository. You can find examples of the tests used in each file and related unit tests in the `tests` directory.
If you notice an issue with this repository, please report it using GitHub Issues. When reporting an implementation bug, include a small example that helps reproduce the error. The issue will be addressed as quickly as possible.
```bibtex
@article{Newman2022,
  doi       = {10.21105/joss.04171},
  url       = {https://doi.org/10.21105/joss.04171},
  year      = {2022},
  publisher = {The Open Journal},
  volume    = {7},
  number    = {72},
  pages     = {4171},
  author    = {Elizabeth Newman and Lars Ruthotto},
  title     = {`hessQuik`: Fast Hessian computation of composite functions},
  journal   = {Journal of Open Source Software}
}
```
This material is in part based upon work supported by the US National Science Foundation under Grant Number 1751636, the Air Force Office of Scientific Research Award FA9550-20-1-0372, and the US DOE Office of Advanced Scientific Computing Research Field Work Proposal 20-023231. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.