torchlpc provides a PyTorch implementation of the Linear Predictive Coding (LPC) filter, also known as an all-pole filter.
It's fast, differentiable, and supports batched inputs with time-varying filter coefficients.
Given an input signal $\mathbf{x} \in \mathbb{R}^T$ and time-varying LPC coefficients $\mathbf{A} \in \mathbb{R}^{T \times N}$, where $N$ is the filter order, the LPC filter is defined as:
$$
y_t = x_t - \sum_{i=1}^N A_{t,i} y_{t-i}.
$$
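For reference, the recursion can be written out directly. The following loop-based sketch is an illustration rather than part of the library: it is unbatched and assumes zero initial conditions.

```python
import torch

def lpc_reference(x: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    # Direct loop over y_t = x_t - sum_i A[t, i] * y_{t-i}.
    # x: (T,), A: (T, N); initial conditions y_t for t <= 0 are zero.
    T, N = A.shape
    yp = x.new_zeros(T + N)  # yp[:N] holds the (zero) initial conditions
    for t in range(T):
        # yp[t : t + N] is (y_{t-N}, ..., y_{t-1}); flip to align with (A_{t,1}, ..., A_{t,N})
        yp[N + t] = x[t] - torch.dot(A[t], yp[t : t + N].flip(0))
    return yp[N:]
```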
Usage
```python
import torch
from torchlpc import sample_wise_lpc

# Create a batch of 10 signals, each with 100 time steps
x = torch.randn(10, 100)

# Create a batch of 10 sets of LPC coefficients, each with 100 time steps and an order of 3
A = torch.randn(10, 100, 3)

# Apply LPC filtering
y = sample_wise_lpc(x, A)

# Optionally, you can provide initial values for the output signal (default is 0)
zi = torch.randn(10, 3)
y = sample_wise_lpc(x, A, zi=zi)
```
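Since the filter is differentiable, gradients flow back to the input, the coefficients, and the initial conditions through the usual autograd machinery. A minimal sketch (the loss here is arbitrary, chosen only for illustration):

```python
import torch
from torchlpc import sample_wise_lpc

x = torch.randn(10, 100, requires_grad=True)
# Small coefficients make a stable filter more likely (illustration only)
A = (0.1 * torch.randn(10, 100, 3)).requires_grad_()
zi = torch.randn(10, 3, requires_grad=True)

y = sample_wise_lpc(x, A, zi=zi)
y.pow(2).mean().backward()
print(x.grad.shape, A.grad.shape, zi.grad.shape)
```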
The details of the derivation can be found in our preprints [1, 2].
We show that, given the instantaneous gradient $\frac{\partial \mathcal{L}}{\partial y_t}$, where $\mathcal{L}$ is the loss function, the gradients of the LPC filter with respect to the input signal $\mathbf{x}$ and the filter coefficients $\mathbf{A}$ can also be expressed through a time-varying filter:
$$
\frac{\partial \mathcal{L}}{\partial x_t} = \frac{\partial \mathcal{L}}{\partial y_t} - \sum_{\substack{i=1 \\ t+i \leq T}}^{N} A_{t+i,i} \frac{\partial \mathcal{L}}{\partial x_{t+i}},
\qquad
\frac{\partial \mathcal{L}}{\partial A_{t,i}} = -\frac{\partial \mathcal{L}}{\partial x_t}\, y_{t-i}.
$$
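As a quick sanity check (not from the original docs), the analytical gradients can be compared against finite differences with `torch.autograd.gradcheck`. This sketch assumes `sample_wise_lpc` accepts double-precision inputs:

```python
import torch
from torchlpc import sample_wise_lpc

torch.manual_seed(0)
x = torch.randn(2, 16, dtype=torch.double, requires_grad=True)
A = (0.05 * torch.randn(2, 16, 2, dtype=torch.double)).requires_grad_()

# Compare the analytical backward pass against finite differences
assert torch.autograd.gradcheck(sample_wise_lpc, (x, A))
```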
Gradients for the initial condition $y_t|_{t \leq 0}$
The initial conditions provide an entry point at $t = 1$ for the recursion, since we cannot evaluate it back to $t = -\infty$.
Let us assume $A_{t, :}|_{t \leq 0} = 0$ so $y_t|_{t \leq 0} = x_t|_{t \leq 0}$, which also means $\frac{\partial \mathcal{L}}{\partial y_t}|_{t \leq 0} = \frac{\partial \mathcal{L}}{\partial x_t}|_{t \leq 0}$.
Thus, the initial condition gradients are
$$
\frac{\partial \mathcal{L}}{\partial y_t}\bigg|_{-N < t \leq 0} = \frac{\partial \mathcal{L}}{\partial x_t}\bigg|_{-N < t \leq 0}.
$$
In practice, we pad $N$ zeros and an $N \times N$ block of zeros to the beginnings of $\frac{\partial \mathcal{L}}{\partial \mathbf{y}}$ and $\mathbf{A}$, respectively, before evaluating $\frac{\partial \mathcal{L}}{\partial \mathbf{x}}$.
The first $N$ outputs are the gradients with respect to the initial conditions $y_t|_{t \leq 0}$, and the rest are the gradients with respect to $x_t|_{t > 0}$.
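The padding trick can be written out explicitly. The following loop-based sketch is an illustration, not the library's actual kernel: it is unbatched, uses the zero-extension convention above, and orders the first $N$ outputs by time $t = 1-N, \dots, 0$.

```python
import torch

def backward_lpc_grads(grad_y: torch.Tensor, A: torch.Tensor):
    # grad_y: (T,) instantaneous gradients dL/dy_t; A: (T, N).
    T, N = A.shape
    g = torch.cat([grad_y.new_zeros(N), grad_y])  # pad N zeros in front
    Ap = torch.cat([A.new_zeros(N, N), A])        # pad an N x N zero block in front
    v = torch.zeros_like(g)
    # Backward recursion: v_t = g_t - sum_i A_{t+i, i} * v_{t+i}
    for t in range(T + N - 1, -1, -1):
        acc = g[t]
        for i in range(1, N + 1):
            if t + i < T + N:
                acc = acc - Ap[t + i, i - 1] * v[t + i]
        v[t] = acc
    # First N entries: dL/dy_t for t <= 0 (the initial conditions);
    # remaining T entries: dL/dx_t for t = 1..T.
    return v[:N], v[N:]
```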
Time-invariant filtering
In the time-invariant setting, $A_{t, i} = A_{1, i}\ \forall t \in [1, T]$. Writing $a_i = A_{1, i}$, the filter simplifies to
$$
y_t = x_t - \sum_{i=1}^{N} a_i y_{t-i}.
$$
The gradients $\frac{\partial \mathcal{L}}{\partial \mathbf{x}}$ are again obtained by filtering $\frac{\partial \mathcal{L}}{\partial \mathbf{y}}$ with $\mathbf{a}$ backwards in time, the same as in the time-varying case.
$\frac{\partial \mathcal{L}}{\partial \mathbf{a}}$ reduces to a vector-matrix multiplication:
$$
\frac{\partial \mathcal{L}}{\partial a_i} = -\sum_{t=1}^{T} \frac{\partial \mathcal{L}}{\partial x_t} y_{t-i},
\quad \text{i.e.,} \quad
\frac{\partial \mathcal{L}}{\partial \mathbf{a}} = -\frac{\partial \mathcal{L}}{\partial \mathbf{x}}
\begin{bmatrix}
y_0 & y_{-1} & \cdots & y_{1-N} \\
y_1 & y_0 & \cdots & y_{2-N} \\
\vdots & \vdots & \ddots & \vdots \\
y_{T-1} & y_{T-2} & \cdots & y_{T-N}
\end{bmatrix}.
$$
This algorithm is more efficient than [3] because it needs only one pass of filtering to obtain both gradients, while the latter needs two.
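In code, the vector-matrix form amounts to stacking delayed copies of $\mathbf{y}$ into a matrix. A sketch under the zero-initial-condition assumption (the helper name is ours, not the library's):

```python
import torch

def grad_wrt_a(grad_x: torch.Tensor, y: torch.Tensor, N: int) -> torch.Tensor:
    # dL/da_i = -sum_t dL/dx_t * y_{t-i}, with y_t = 0 for t <= 0.
    T = y.shape[0]
    yp = torch.cat([y.new_zeros(N), y])
    # Column i (0-based) holds y delayed by i + 1 samples.
    Y = torch.stack([yp[N - i - 1 : N - i - 1 + T] for i in range(N)], dim=1)
    return -(grad_x @ Y)  # shape: (N,)
```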
TODO
- Use PyTorch C++ extension for faster computation.
- Use native CUDA kernels for GPU computation.
- Add examples.
Related Projects
- torchcomp: differentiable compressors that use torchlpc for backpropagation.
- jaxpole: an equivalent implementation in JAX by @rodrigodzf.
Citation
If you find this repository useful in your research, please cite our work with the following BibTeX entries:
@inproceedings{ycy2024diffapf,
  title = {Differentiable All-pole Filters for Time-varying Audio Systems},
  author = {Chin-Yun Yu and Christopher Mitcheltree and Alistair Carson and Stefan Bilbao and Joshua D. Reiss and György Fazekas},
  booktitle = {International Conference on Digital Audio Effects (DAFx)},
  year = {2024},
  pages = {345--352},
}
@inproceedings{ycy2024golf,
  title = {Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis},
  author = {Chin-Yun Yu and György Fazekas},
  year = {2024},
  booktitle = {Proc. Interspeech},
  pages = {1820--1824},
  doi = {10.21437/Interspeech.2024-1187},
}