This is an official implementation of Dilated Convolution with Learnable Spacings by Ismail Khalfaoui Hassani, Thomas Pellegrini and Timothée Masquelier.
Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a novel convolution method based on gradient descent and interpolation. It can be seen as an improvement on the well-known dilated convolution, which has been widely explored in deep convolutional neural networks and which inflates the convolutional kernel by inserting spaces between the kernel elements.
In DCLS, the positions of the weights within the convolutional kernel are learned in a gradient-based manner, and the inherent problem of non-differentiability due to the integer nature of the positions in the kernel is solved by taking advantage of an interpolation method.
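The idea can be illustrated in one dimension. The following is a minimal sketch, not the library's implementation: it assumes positions are real-valued indices inside the dense kernel and uses linear interpolation between the two nearest integer slots. The name `construct_kernel_1d` is hypothetical.

```python
import math

def construct_kernel_1d(weights, positions, dilated_kernel_size):
    """Scatter each weight onto a dense 1D kernel using linear interpolation.

    positions are real-valued indices in [0, dilated_kernel_size - 1]
    (an assumption made for this sketch).
    """
    kernel = [0.0] * dilated_kernel_size
    for w, p in zip(weights, positions):
        # Keep the position inside the kernel.
        p = min(max(p, 0.0), dilated_kernel_size - 1)
        left = int(math.floor(p))
        frac = p - left  # distance to the left neighbour, in [0, 1)
        # The weight is shared between the two nearest integer slots.
        kernel[left] += w * (1.0 - frac)
        if frac > 0.0:
            kernel[left + 1] += w * frac
    return kernel

# Two kernel elements inside a kernel of size 7.
kernel = construct_kernel_1d([1.0, 2.0], [2.25, 5.0], 7)
print(kernel)
```

Because each dense entry is a piecewise-linear function of the position, its derivative with respect to the position exists almost everywhere, which is what allows the position to be updated by gradient descent.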
For now, the code has only been implemented in PyTorch.
The method is described in the article Dilated Convolution with Learnable Spacings. The Gaussian and triangle versions are described in the arXiv preprint Dilated Convolution with Learnable Spacings: beyond bilinear interpolation.
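For intuition about the Gaussian version, here is a minimal 1D sketch. It assumes that a kernel element is spread over all integer slots with normalized Gaussian weights centered at its position; the normalization, the `eps` term and the name `gaussian_kernel_1d` are assumptions made for illustration, and the preprint should be consulted for the exact formulation.

```python
import math

def gaussian_kernel_1d(weight, position, sigma, dilated_kernel_size, eps=1e-7):
    """Spread one weight over a dense 1D kernel with a normalized Gaussian."""
    # Unnormalized Gaussian evaluated at every integer kernel index.
    g = [
        math.exp(-0.5 * ((i - position) / sigma) ** 2)
        for i in range(dilated_kernel_size)
    ]
    # Normalize so that the dense entries sum (almost exactly) to the weight.
    total = sum(g) + eps
    return [weight * v / total for v in g]

gk = gaussian_kernel_1d(weight=1.0, position=3.0, sigma=1.0, dilated_kernel_size=7)
print([round(v, 3) for v in gk])
```

In this sketch both the position and sigma enter through smooth functions, so both can receive gradients, which is consistent with the `m.P.grad` and `m.SIG.grad` attributes used in the Gaussian example of this README.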
Dec 22, 2023:
- Dcls2d could already be used with flat dilated kernel sizes ((7,1) for example). However, this introduces unnecessary position and sigma learning along the flat dimension. We introduce Dcls2dK1d where a 1D flat kernel is constructed but a 2D convolution is applied. Please see the Usage section for a use case.
Oct 19, 2023:
- A new family of DCLS methods is implemented: DclsN_Md (N is the convolution dimension and M is the number of dimensions, among the N, along which positions are learned). Currently, only the Dcls3_1d method is available. Please see the Usage section for a use case.
Sep 28, 2023:
- 🚀 🚀 A new repository for audio classification on AudioSet using DCLS with state-of-the-art vision models adapted to audio spectrograms. Please check out the git repo DCLS Audio and/or the paper Audio classification with Dilated Convolution with Learnable Spacings. Model checkpoints are available!
Sep 22, 2023:
- 🎉 🎉 The paper on DCLS Gaussian interpolation Dilated Convolution with Learnable Spacings: beyond bilinear interpolation has been published at the Differentiable Almost Everything Workshop of the 40th International Conference on Machine Learning [ICML2023].
Previous news
Jun 16, 2023:
- A new tutorial on how to use DCLS in vision backbones is now available: DCLS Vision Tutorial.
- A short blog post which summarizes the DCLS method has been published on Medium: What is Dilated Convolution with Learnable Spacings (DCLS) and how to use it?
Jun 2, 2023:
- New DCLS version supports Gaussian and triangle interpolations in addition to previous bilinear interpolation. To use it, please do:
pip3 install --upgrade --force-reinstall dcls
or recompile after a git update.
import torch
from DCLS.construct.modules import Dcls2d
# Dcls2d with Gaussian interpolation. available versions : ["gauss", "max", "v1", "v0"]
m = Dcls2d(96, 96, kernel_count=26, dilated_kernel_size=17, padding=8, groups=96, version="gauss")
input = torch.randn(20, 96, 50, 100)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad, m.SIG.grad)
- Learning techniques for this method are described in Dilated Convolution with Learnable Spacings: beyond bilinear interpolation.
Apr 16, 2023:
- Fixed an important bug in the Dcls1d version. Please reinstall the pip wheel via
pip3 install --upgrade --force-reinstall dcls
or recompile after a git update.
Jan 7, 2023:
- Important modification to the ConstructKernel{1,2,3}d algorithm, which now uses less memory. This modification enables very large kernel counts. For example:
from DCLS.construct.modules import Dcls2d
m = Dcls2d(96, 96, kernel_count=2000, dilated_kernel_size=7, padding=3, groups=96).cuda()
After installing the new version 0.0.3 of DCLS, usage remains unchanged.
Nov 8, 2022:
- The previous main branch has been moved to the cuda branch. The main branch now relies on fully native torch conv{1,2,3}d.
Sep 27, 2022:
- Code release for ConvNeXt-dcls experiments. See ConvNeXt-dcls.
DCLS is based on PyTorch and CUDA. Please make sure that you have installed all the requirements before you install DCLS.
Requirements:
- PyTorch version torch>=1.6.0. See torch.
Preferred versions:
pip3 install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
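The torch>=1.6.0 requirement can be checked before installing DCLS. The snippet below is a minimal sketch using only the standard library: `meets_minimum` is a hypothetical helper that compares the numeric release part of a version string, ignoring local suffixes such as `+cu111`.

```python
import re

def meets_minimum(version, minimum=(1, 6, 0)):
    """Return True if the release part of `version` is at least `minimum`."""
    # Keep only the leading numeric part, e.g. "1.8.0+cu111" -> (1, 8, 0).
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(g) if g is not None else 0 for g in match.groups())
    return parts >= minimum

print(meets_minimum("1.8.0+cu111"))
print(meets_minimum("1.5.1"))
```

In practice the string to test would be `torch.__version__`.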
Install the latest development version from source:
From GitHub:
git clone https://github.com/K-H-Ismail/Dilated-Convolution-with-Learnable-Spacings-PyTorch.git
cd Dilated-Convolution-with-Learnable-Spacings-PyTorch
python3 -m pip install --upgrade pip
python3 -m build
python3 -m pip install dist/dcls-0.1.1-py3-none-any.whl
Install the latest stable version from PyPI:
pip3 install dcls
Dcls methods can easily be used as a substitute for PyTorch's classical nn.ConvNd convolution methods:
import torch
from DCLS.construct.modules import Dcls2d
# With square kernels, equal stride and dilation
m = Dcls2d(16, 33, kernel_count=3, dilated_kernel_size=7)
input = torch.randn(20, 16, 50, 100)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)
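The spatial size of the output can be anticipated with the usual convolution size formula. The sketch below assumes, without confirmation from this README, that a DCLS layer behaves like a standard convolution whose kernel has size dilated_kernel_size, with stride 1 and padding 0 by default. `conv_output_size` is a hypothetical helper, not part of the DCLS package.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Standard convolution output length along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Example above: 50 x 100 input, dilated_kernel_size=7, no padding.
height = conv_output_size(50, 7)
width = conv_output_size(100, 7)
print(height, width)

# With an odd kernel and stride 1, padding = dilated_kernel_size // 2
# keeps the spatial size unchanged.
same = conv_output_size(56, 17, padding=17 // 2)
print(same)
```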
A typical use is with separable (depthwise) convolutions:
import torch
from DCLS.construct.modules import Dcls2d
m = Dcls2d(96, 96, kernel_count=34, dilated_kernel_size=17, padding=8, groups=96)
input = torch.randn(128, 96, 56, 56)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)
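To see why a small kernel_count is attractive in the depthwise setting, the parameter budget can be compared with a dense kernel of the same extent. This is a rough sketch under stated assumptions: one weight per kernel element, one real-valued position per element and per spatial dimension, and bias ignored. Both helper functions are hypothetical and do not query the library.

```python
def dcls_param_count(in_channels, out_channels, kernel_count, groups=1, ndim=2):
    """Assumed DCLS budget: (number of weights, number of positions)."""
    weights = out_channels * (in_channels // groups) * kernel_count
    positions = ndim * weights  # one coordinate per spatial dimension
    return weights, positions

def dense_param_count(in_channels, out_channels, kernel_size, groups=1, ndim=2):
    """Weights of a standard convolution with a full kernel_size**ndim kernel."""
    return out_channels * (in_channels // groups) * kernel_size ** ndim

# Depthwise example above: 96 channels, 34 elements in a 17 x 17 kernel.
weights, positions = dcls_param_count(96, 96, kernel_count=34, groups=96)
dense = dense_param_count(96, 96, kernel_size=17, groups=96)
print(weights, positions, dense)
```

Under these assumptions, the DCLS layer covers a 17 x 17 receptive field with about a third of the parameters of the dense depthwise kernel.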
Dcls with different dimensions
import torch
from DCLS.construct.modules import Dcls1d
# Will construct 1D kernels of size 7 with 3 elements inside each kernel
m = Dcls1d(3, 16, kernel_count=3, dilated_kernel_size=7)
input = torch.rand(8, 3, 32)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)
import torch
from DCLS.construct.modules import Dcls3d
m = Dcls3d(16, 33, kernel_count=10, dilated_kernel_size=(7,8,9))
input = torch.randn(20, 16, 50, 100, 30)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)
DepthWiseConv2dImplicitGEMM for 2D-DCLS:
For 2D-DCLS, to install and enable the DepthWiseConv2dImplicitGEMM, please follow the instructions of RepLKNet. Otherwise, PyTorch's native Conv2d method will be used.
Dcls can also be used for kernels where position learning is restricted to a subset of the N kernel dimensions. This is provided by a generic family of classes called DclsN_Md (N is the convolution dimension and M is the number of dimensions, among the N, along which positions are learned). For now, only the Dcls3_1d method is available; it can be used as follows:
import torch
from DCLS.construct.modules import Dcls3_1d
m = Dcls3_1d(
out_channels=32,
in_channels=32,
kernel_count=26,
dilated_kernel_size=11, # the dimension along which positions are learned
dense_kernel_size=(3, 3), # no learnable positions in these 2 dims
groups=1,
padding=(1, 1, 11 // 2),
)
# The last dimension of the input is always where positions are learned
input = torch.randn(8, 32, 11, 11, 31)
output = m(input)
loss = output.sum()
loss.backward()
print(output.size(), m.weight.grad.size(), m.P.grad.size())
For flat kernels, DclsNdKMd can be used. For now, only the Dcls2dK1d method is available; it can be used as follows:
import torch
from DCLS.construct.modules import Dcls2dK1d
m = Dcls2dK1d(
out_channels=32,
in_channels=32,
kernel_count=3,
dilated_kernel_size=11,
flat_dim=0,  # the flat dimension; here it is equivalent to torch.nn.Conv2d with kernel_size=(1,11)
groups=1,
padding=(0, 11 // 2),
)
# The last dimension of the input is always where positions are learned
input = torch.randn(8, 32, 56, 56)
output = m(input)
loss = output.sum()
loss.backward()
print(output.size(), m.weight.grad.size(), m.P.grad.size())
DCLS currently supports both CPU and Nvidia CUDA GPU devices.
- Nvidia GPU
- CPU
When running on a GPU, make sure that both your data and your model are on the CUDA device.
If you use DCLS in your work, please consider citing it as follows:
@inproceedings{
hassani2023dilated,
title={Dilated convolution with learnable spacings},
author={Ismail Khalfaoui-Hassani and Thomas Pellegrini and Timoth{\'e}e Masquelier},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023},
url={https://openreview.net/forum?id=Q3-1vRh3HOA}
}
If you use DCLS with Gaussian or triangle interpolations in your work, please consider citing this paper as well:
@inproceedings{
khalfaoui-hassani2023dilated,
title={Dilated Convolution with Learnable Spacings: beyond bilinear interpolation},
author={Ismail Khalfaoui-Hassani and Thomas Pellegrini and Timoth{\'e}e Masquelier},
booktitle={ICML 2023 Workshop on Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators},
year={2023},
url={https://openreview.net/forum?id=j8FPBCltB9}
}
This project is open source, so all contributions are welcome, whether that means reporting issues, finding and fixing bugs, requesting new features, or sending pull requests.