auraloss
A collection of audio-focused loss functions in PyTorch. [PDF]
Setup
pip install git+https://github.com/csteinmetz1/auraloss
Usage
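import torch
import auraloss
# Multi-resolution STFT loss with its default settings
mrstft = auraloss.freq.MultiResolutionSTFTLoss()
# Random stand-ins for predicted and target audio, shaped (batch, channels, samples)
input = torch.rand(8, 1, 44100)
target = torch.rand(8, 1, 44100)
loss = mrstft(input, target)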
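To give some intuition for what a multi-resolution STFT loss computes, here is a minimal sketch in plain PyTorch, following the formulation described by Arik et al. (2018) and Yamamoto et al. (2019): at each resolution, a spectral convergence term is added to an L1 distance between log magnitudes, and the results are averaged. This is not the code used inside auraloss. The function names, the clamping constant, and the three resolutions are assumptions chosen for illustration, and the library's defaults and weighting may differ.

```python
import torch
import torch.nn.functional as F


def stft_magnitude(x, fft_size, hop_size, win_length):
    """Magnitude spectrogram of a (batch, samples) tensor."""
    window = torch.hann_window(win_length, device=x.device)
    spec = torch.stft(
        x,
        fft_size,
        hop_length=hop_size,
        win_length=win_length,
        window=window,
        return_complex=True,
    )
    # Clamp so the log-magnitude term never sees log(0).
    return spec.abs().clamp(min=1e-8)


def multi_resolution_stft_sketch(pred, target, resolutions):
    """Average of (spectral convergence + log-magnitude L1) over resolutions."""
    pred, target = pred.squeeze(1), target.squeeze(1)  # (B, 1, T) -> (B, T)
    total = 0.0
    for fft_size, hop_size, win_length in resolutions:
        p = stft_magnitude(pred, fft_size, hop_size, win_length)
        t = stft_magnitude(target, fft_size, hop_size, win_length)
        # Spectral convergence: Frobenius norm of the error over that of the target.
        sc = torch.sqrt(((t - p) ** 2).sum()) / torch.sqrt((t ** 2).sum())
        # Log STFT magnitude: mean absolute difference of the log magnitudes.
        log_mag = F.l1_loss(torch.log(p), torch.log(t))
        total = total + sc + log_mag
    return total / len(resolutions)


# Illustrative (fft_size, hop_size, win_length) triples; assumed values.
resolutions = [(1024, 120, 600), (2048, 240, 1200), (512, 50, 240)]

pred = torch.rand(2, 1, 8192, requires_grad=True)
target = torch.rand(2, 1, 8192)
loss = multi_resolution_stft_sketch(pred, target, resolutions)
loss.backward()  # gradients flow back to the predicted waveform
```

Using several window sizes trades off time and frequency resolution, so an error that is hard to see at one resolution can still be penalized at another.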
Loss functions
We categorize the loss functions as either time-domain or frequency-domain approaches. Additionally, we include perceptual transforms.
Loss function | Interface | Reference |
---|---|---|
Time domain | | |
Error-to-signal ratio (ESR) | auraloss.time.ESRLoss() | Wright & Välimäki, 2019 |
DC error (DC) | auraloss.time.DCLoss() | Wright & Välimäki, 2019 |
Log hyperbolic cosine (Log-cosh) | auraloss.time.LogCoshLoss() | Chen et al., 2019 |
Signal-to-distortion ratio (SDR) | auraloss.time.SDRLoss() | Vincent et al., 2006 |
Scale-invariant signal-to-distortion ratio (SI-SDR) | auraloss.time.SISDRLoss() | Le Roux et al., 2018 |
Frequency domain | | |
Spectral convergence | auraloss.freq.SpectralConvergenceLoss() | Arik et al., 2018 |
Log STFT magnitude | auraloss.freq.LogSTFTMagnitudeLoss() | Arik et al., 2018 |
Aggregate STFT | auraloss.freq.STFTLoss() | Arik et al., 2018 |
Multi-resolution STFT | auraloss.freq.MultiResolutionSTFTLoss() | Yamamoto et al., 2019 |
Random-resolution STFT | auraloss.freq.RandomResolutionSTFTLoss() | Steinmetz & Reiss, 2020 |
Sum and difference STFT loss | auraloss.freq.SumAndDifferenceSTFTLoss() | Steinmetz et al., 2020 |
Perceptual transforms | | |
Sum and difference signal transform | auraloss.perceptual.SumAndDifference() | |
FIR pre-emphasis filters | auraloss.perceptual.FIRFilter() | Wright & Välimäki, 2019 |
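As a rough guide to what the simpler entries in the table measure, the sketch below writes out textbook-style definitions of ESR, DC error, and log-cosh, along with the two perceptual transforms, in plain PyTorch. These are illustrative reference functions and not the auraloss implementations. The function names, the `eps` stabilizer, the reduction over all elements, the unscaled sum and difference, and the pre-emphasis coefficient of 0.85 are all assumptions made for this example.

```python
import torch


def esr(pred, target, eps=1e-8):
    """Error-to-signal ratio: error energy divided by target energy."""
    return ((target - pred) ** 2).sum() / ((target ** 2).sum() + eps)


def dc(pred, target, eps=1e-8):
    """DC error: squared mean of the error, normalized by target power."""
    return ((target - pred).mean() ** 2) / ((target ** 2).mean() + eps)


def log_cosh(pred, target):
    """Mean of log(cosh(error)): near-quadratic for small errors, near-linear for large."""
    return torch.log(torch.cosh(pred - target)).mean()


def sum_and_difference(x):
    """Map stereo (batch, 2, samples) audio to sum (L + R) and difference (L - R)."""
    left, right = x[:, 0, :], x[:, 1, :]
    return left + right, left - right


def pre_emphasis(x, coef=0.85):
    """First-order FIR high-pass, y[n] = x[n] - coef * x[n-1], along the last axis."""
    return torch.cat([x[..., :1], x[..., 1:] - coef * x[..., :-1]], dim=-1)


# A perceptual transform is applied to both signals before a loss is computed.
pred = torch.rand(4, 1, 1024)
target = torch.rand(4, 1, 1024)
loss = esr(pre_emphasis(pred), pre_emphasis(target)) + dc(pred, target)
```

Filtering both signals before computing ESR emphasizes high-frequency errors, which is the motivation Wright & Välimäki give for pairing pre-emphasis with a time-domain loss.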
Examples
Currently we include an example using a set of the loss functions to train a TCN for modeling an analog dynamic range compressor.
For details, please refer to examples/compressor.
We provide pre-trained models, evaluation scripts to compute the metrics in the paper, as well as scripts to retrain models.
Development
Note that a few losses (SDR, SI-SDR) have yet to be implemented, but they are planned. There are also currently no tests, although these are planned as well, so use caution for now. Future additions will target neural-network-based perceptual losses, which tend to be more sophisticated than the losses included so far.
If you are interested in adding a loss function, please make a pull request.
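Since SI-SDR is listed above as not yet implemented, the following is a minimal sketch of the definition from Le Roux et al. (2018) for anyone who wants a reference point. It is not the library's code. The function names, the mean removal step, the `eps` stabilizer, and the choice to reduce over the last axis are assumptions made for this example.

```python
import torch


def si_sdr(pred, target, eps=1e-8):
    """Scale-invariant SDR in dB, computed along the last axis."""
    # Remove the mean so a constant offset does not affect the ratio.
    pred = pred - pred.mean(dim=-1, keepdim=True)
    target = target - target.mean(dim=-1, keepdim=True)
    # Project the estimate onto the target to find the optimal scaling.
    dot = (pred * target).sum(dim=-1, keepdim=True)
    energy = (target ** 2).sum(dim=-1, keepdim=True)
    scaled_target = (dot / (energy + eps)) * target
    # Whatever the scaled target does not explain counts as distortion.
    residual = pred - scaled_target
    ratio = (scaled_target ** 2).sum(dim=-1) / ((residual ** 2).sum(dim=-1) + eps)
    return 10 * torch.log10(ratio + eps)


def si_sdr_loss(pred, target):
    """Negated mean SI-SDR, so that minimizing the loss raises SI-SDR."""
    return -si_sdr(pred, target).mean()


torch.manual_seed(0)
target = torch.randn(2, 1000)
pred = target + 0.1 * torch.randn(2, 1000)
loss = si_sdr_loss(pred, target)
```

Because the target is rescaled before the ratio is taken, multiplying `pred` by a constant gain leaves the value essentially unchanged, which is the property that distinguishes SI-SDR from plain SDR.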
Cite
If you use this code in your work please consider citing us.
@inproceedings{steinmetz2020auraloss,
title={auraloss: {A}udio focused loss functions in {PyTorch}},
author={Steinmetz, Christian J. and Reiss, Joshua D.},
booktitle={Digital Music Research Network One-day Workshop (DMRN+15)},
year={2020}}