[ICLR 2023] "Dilated convolution with learnable spacings" Ismail Khalfaoui Hassani, Thomas Pellegrini and Timothée Masquelier

Dilated-Convolution-with-Learnable-Spacings-PyTorch

This is an official implementation of Dilated Convolution with Learnable Spacings by Ismail Khalfaoui Hassani, Thomas Pellegrini and Timothée Masquelier.

Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a novel convolution method based on gradient descent and interpolation. It can be seen as an improvement of the well-known dilated convolution, which has been widely explored in deep convolutional neural networks and which inflates the convolutional kernel by inserting spaces between the kernel elements.
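To make the "inserting spaces" idea concrete, here is a minimal sketch in plain Python of how a standard dilated convolution inflates a 1D kernel with a fixed, regular spacing. It is not taken from the DCLS code base, and the helper name dilate_kernel_1d is hypothetical.

```python
def dilate_kernel_1d(weights, dilation):
    """Insert (dilation - 1) zeros between consecutive kernel elements."""
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    if not weights:
        return []
    # Size of the inflated kernel: dilation * (k - 1) + 1
    size = dilation * (len(weights) - 1) + 1
    dilated = [0.0] * size
    for i, w in enumerate(weights):
        # Every weight sits on a fixed grid position, a multiple of the dilation
        dilated[i * dilation] = w
    return dilated


kernel = [1.0, 2.0, 3.0]
print(dilate_kernel_1d(kernel, dilation=3))
# [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 3.0]
```

In this classical scheme the non-zero positions are fixed by the dilation rate. DCLS instead treats those positions as learnable parameters.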

In DCLS, the positions of the weights within the convolutional kernel are learned in a gradient-based manner, and the inherent problem of non-differentiability due to the integer nature of the positions in the kernel is solved by taking advantage of an interpolation method.
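The role of interpolation can be illustrated with a small 1D sketch in plain Python. It assumes zero-based positions inside the dilated kernel and linear (1D bilinear) interpolation. The function name interpolate_kernel_1d is hypothetical, and the library's own kernel construction may differ in indexing and details.

```python
import math


def interpolate_kernel_1d(weights, positions, dilated_kernel_size):
    """Spread each weight over the two integer slots around its real-valued position."""
    kernel = [0.0] * dilated_kernel_size
    for w, p in zip(weights, positions):
        # Keep the position inside the kernel bounds (assumption of this sketch)
        p = min(max(p, 0.0), dilated_kernel_size - 1)
        left = math.floor(p)
        frac = p - left
        # The closer slot receives the larger share of the weight
        kernel[left] += w * (1.0 - frac)
        if frac > 0.0:
            kernel[left + 1] += w * frac
    return kernel


print(interpolate_kernel_1d([1.0, 2.0], [1.25, 4.0], dilated_kernel_size=7))
# [0.0, 0.75, 0.25, 0.0, 2.0, 0.0, 0.0]
```

Because each kernel entry is a piecewise linear function of the position, its derivative with respect to the position is defined almost everywhere, which is what allows positions to be updated by gradient descent.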

For now, the code has only been implemented in PyTorch.

The method is described in the article Dilated Convolution with Learnable Spacings. The Gaussian and triangle versions are described in the arXiv preprint Dilated Convolution with Learnable Spacings: beyond bilinear interpolation.
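As a rough illustration of how Gaussian and triangle interpolations differ from the bilinear one, the following plain-Python sketch spreads a single weight over a 1D kernel. The function names, the zero-based positions, and the normalization (contributions summing to the weight) are assumptions made for this sketch; the exact parameterization used by the library and the preprint may differ.

```python
import math


def gaussian_kernel_1d(weight, position, sigma, dilated_kernel_size):
    """Spread one weight over all slots with a normalized Gaussian centered at `position`."""
    scores = [
        math.exp(-0.5 * ((i - position) / sigma) ** 2)
        for i in range(dilated_kernel_size)
    ]
    total = sum(scores)
    return [weight * s / total for s in scores]


def triangle_kernel_1d(weight, position, sigma, dilated_kernel_size):
    """Spread one weight with a normalized triangle function of half-width `sigma`."""
    scores = [max(0.0, sigma - abs(i - position)) for i in range(dilated_kernel_size)]
    total = sum(scores)
    if total == 0.0:
        # The triangle is too narrow to reach any integer slot
        return [0.0] * dilated_kernel_size
    return [weight * s / total for s in scores]


# With sigma = 1 the triangle version gives the same result as linear interpolation
print(triangle_kernel_1d(1.0, 1.25, 1.0, 7))
# A wider sigma lets more slots receive a share of the weight
print(gaussian_kernel_1d(1.0, 3.2, 1.5, 7))
```

In both variants the spread sigma is a differentiable parameter in its own right, which matches the sigma parameter that the Gaussian version of the layer exposes alongside the positions.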

What's new

Dec 22, 2023:

  • Dcls2d could already be used with flat dilated kernel sizes ((7,1) for example). However, this introduces unnecessary position and sigma learning along the flat dimension. We introduce Dcls2dK1d where a 1D flat kernel is constructed but a 2D convolution is applied. Please see the Usage section for a use case.

Oct 19, 2023:

  • A new family of DCLS methods is implemented: DclsN_Md (N is the convolution dimension and M is the number of dimensions, out of N, along which positions are learned). Currently, only the Dcls3_1d method is available. Please see the Usage section for a use case.

Sep 28, 2023:

  • 🚀 🚀 A new repository for audio classification on AudioSet using DCLS, with state-of-the-art vision models adapted to audio spectrograms. Please check out the git repo DCLS Audio and/or the paper Audio classification with Dilated Convolution with Learnable Spacings (arXiv). Model checkpoints are available!

Sep 22, 2023:

Previous news

Jun 16, 2023:

Jun 2, 2023:

  • The new DCLS version supports Gaussian and triangle interpolations in addition to the previous bilinear interpolation. To use it, run:
pip3 install --upgrade --force-reinstall dcls

or recompile after a git update.

import torch
from DCLS.construct.modules import  Dcls2d

# Dcls2d with Gaussian interpolation. Available versions: ["gauss", "max", "v1", "v0"]
m = Dcls2d(96, 96, kernel_count=26, dilated_kernel_size=17, padding=8, groups=96, version="gauss")
input = torch.randn(20, 96, 50, 100)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad, m.SIG.grad)

Apr 16, 2023:

  • Fixed an important bug in the Dcls1d version. Please reinstall the pip wheel via
pip3 install --upgrade --force-reinstall dcls

or recompile after a git update.

Jan 7, 2023:

  • Important modification to the ConstructKernel{1,2,3}d algorithm, which now uses less memory. This modification enables very large kernel counts. For example:
from DCLS.construct.modules import  Dcls2d

m = Dcls2d(96, 96, kernel_count=2000, dilated_kernel_size=7, padding=3, groups=96).cuda() 

After installing the new version 0.0.3 of DCLS, usage remains unchanged.

Nov 8, 2022:

  • The previous main branch has been moved to the cuda branch. The main branch now relies on fully native torch conv{1,2,3}d.

Sep 27, 2022:

Installation

DCLS is based on PyTorch and CUDA. Please make sure that you have installed all the requirements before you install DCLS.

Requirements:

  • Pytorch version torch>=1.6.0. See torch.

Preferred versions:

pip3 install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

Install the latest development version from source:

From GitHub:

git clone https://github.com/K-H-Ismail/Dilated-Convolution-with-Learnable-Spacings-PyTorch.git
cd Dilated-Convolution-with-Learnable-Spacings-PyTorch
python3 -m pip install --upgrade pip build
python3 -m build 
python3 -m pip install dist/dcls-0.1.1-py3-none-any.whl 

Install the latest stable version from PyPI:

pip3 install dcls

Usage

Dcls methods can easily be used as a substitute for PyTorch's classical nn.ConvNd convolution modules:

import torch
from DCLS.construct.modules import  Dcls2d

# With square kernels, equal stride and dilation
m = Dcls2d(16, 33, kernel_count=3, dilated_kernel_size=7)
input = torch.randn(20, 16, 50, 100)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)

A typical use is with depthwise separable convolutions:

import torch
from DCLS.construct.modules import  Dcls2d

m = Dcls2d(96, 96, kernel_count=34, dilated_kernel_size=17, padding=8, groups=96)
input = torch.randn(128, 96, 56, 56)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)
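The padding values in the examples above follow the usual convolution arithmetic. The sketch below assumes that a Dcls layer behaves like a regular stride-1 convolution whose kernel size is dilated_kernel_size; the helper conv_output_size is hypothetical and only restates the standard output-size formula.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Standard output-size formula for a convolution with dilation 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# dilated_kernel_size=17 with padding=8 preserves a 56x56 feature map
print(conv_output_size(56, 17, padding=8))  # 56

# dilated_kernel_size=7 without padding shrinks a 50x100 input to 44x94
print(conv_output_size(50, 7), conv_output_size(100, 7))  # 44 94
```

Under this assumption, choosing padding = dilated_kernel_size // 2 for an odd dilated_kernel_size keeps the spatial size unchanged.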

Dcls with different dimensions:

import torch
from DCLS.construct.modules import  Dcls1d 

# Will construct 1D kernels of size 7 with 3 elements inside each kernel
m = Dcls1d(3, 16, kernel_count=3, dilated_kernel_size=7)
input = torch.rand(8, 3, 32)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)

import torch
from DCLS.construct.modules import  Dcls3d

m = Dcls3d(16, 33, kernel_count=10, dilated_kernel_size=(7,8,9))
input = torch.randn(20, 16, 50, 100, 30)
output = m(input)
loss = output.sum()
loss.backward()
print(output, m.weight.grad, m.P.grad)

DepthWiseConv2dImplicitGEMM for 2D-DCLS:

For 2D-DCLS, to install and enable the DepthWiseConv2dImplicitGEMM, please follow the instructions of RepLKNet. Otherwise, Pytorch's native Conv2D method will be used.

Dcls can also be used for kernels where position learning is restricted to a subset of the N kernel dimensions. This is provided by a generic family of classes called DclsN_Md (N is the convolution dimension and M is the number of dimensions, out of N, along which positions are learned). For now, only the Dcls3_1d method is available, and it can be used as follows:

import torch
from DCLS.construct.modules import  Dcls3_1d

m = Dcls3_1d(
    out_channels=32,
    in_channels=32,
    kernel_count=26,
    dilated_kernel_size=11, # the dimension along which positions are learned
    dense_kernel_size=(3, 3), # no learnable positions in these 2 dims
    groups=1,
    padding=(1, 1, 11 // 2),
)
# The last dimension of the input is always where positions are learned
input = torch.randn(8, 32, 11, 11, 31)
output = m(input)
loss = output.sum()
loss.backward()
print(output.size(), m.weight.grad.size(), m.P.grad.size())

For flat kernels, DclsNdKMd can be used. For now, only the Dcls2dK1d method is available, and it can be used as follows:

import torch
from DCLS.construct.modules import  Dcls2dK1d

m = Dcls2dK1d(
    out_channels=32,
    in_channels=32,
    kernel_count=3,
    dilated_kernel_size=11,
    # the flat dimension; here the layer is equivalent to torch.nn.Conv2d with kernel_size=(1, 11)
    flat_dim=0,
    groups=1,
    padding=(0, 11 // 2),
)
# The last dimension of the input is always where positions are learned
input = torch.randn(8, 32, 56, 56)
output = m(input)
loss = output.sum()
loss.backward()
print(output.size(), m.weight.grad.size(), m.P.grad.size())

Device Support

DCLS currently supports CPU and Nvidia CUDA GPU devices:

  • Nvidia GPU
  • CPU

When running on a GPU, make sure that both your data and your model are on the CUDA device.

Publications and Citation

If you use DCLS in your work, please consider citing it as follows:

@inproceedings{
hassani2023dilated,
title={Dilated convolution with learnable spacings},
author={Ismail Khalfaoui-Hassani and Thomas Pellegrini and Timoth{\'e}e Masquelier},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=Q3-1vRh3HOA}
}

If you use DCLS with Gaussian or triangle interpolations in your work, please consider citing this as well:

@inproceedings{
khalfaoui-hassani2023dilated,
title={Dilated Convolution with Learnable Spacings: beyond bilinear interpolation},
author={Ismail Khalfaoui-Hassani and Thomas Pellegrini and Timoth{\'e}e Masquelier},
booktitle={ICML 2023 Workshop on Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators},
year={2023},
url={https://openreview.net/forum?id=j8FPBCltB9}
}

Contribution

This project is open source, so all contributions are welcome, whether reporting issues, finding and fixing bugs, requesting new features, or sending pull requests.