SincNetConv

A PyTorch 1.0 implementation of the convolutions described in SincNet

Primary language: Python. License: MIT.

SincConv

A PyTorch 1.0 implementation of the band-pass convolutions described in "Interpretable Convolutional Filters with SincNet". Compared with a standard convolution, the paper reports the following practical benefits for audio-domain models such as speaker or phoneme recognition:

  • Fewer parameters
  • Faster convergence
  • Interpretable filters
  • Better performance
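The "fewer parameters" point follows from how the paper parameterises each filter: a band-pass filter is defined by two learnable cutoff values rather than one learnable weight per tap. A rough count is sketched below. The function names are hypothetical and the sizes (80 filters of length 251, one input channel) are illustrative assumptions, not values read from this repository.

```python
def standard_conv_params(n_filters: int, kernel_size: int, in_channels: int = 1) -> int:
    """Weights learned by an ordinary 1-D convolution (bias ignored)."""
    return n_filters * in_channels * kernel_size


def sinc_conv_params(n_filters: int) -> int:
    """Two learnable scalars per filter: the low and high cutoff frequencies."""
    return 2 * n_filters


# Illustrative sizes only (assumed for this example).
n_filters, kernel_size = 80, 251
standard = standard_conv_params(n_filters, kernel_size)
sinc = sinc_conv_params(n_filters)
print(f"standard conv: {standard} weights, sinc conv: {sinc} weights")
```

The sinc count does not depend on the filter length, so longer filters cost no extra parameters.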

Adapted from the official implementation at https://github.com/mravanelli/SincNet/. Compared with the original implementation, the filter bank construction has been parallelised, and padding has been added so that the output preserves the length (time dimension) of the input audio.
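To illustrate what a parallelised filter bank and length-preserving padding can look like, here is a minimal sketch. It is not this repository's code: `sinc_filter_bank` and its arguments are hypothetical names, the per-filter peak normalisation is an illustrative choice, and `torch.sinc` assumes a more recent PyTorch release than the 1.0 version this README targets. All filters are built in one broadcasted tensor expression instead of a Python loop over filters, and `padding=(kernel_size - 1) // 2` keeps the time dimension unchanged for an odd kernel size with stride 1.

```python
import torch
import torch.nn.functional as F


def sinc_filter_bank(low_hz, band_hz, kernel_size, sample_rate):
    """Build all band-pass filters at once; returns (n_filters, 1, kernel_size).

    Hypothetical helper for illustration; kernel_size is assumed to be odd.
    """
    half = (kernel_size - 1) // 2
    # Time axis in seconds, centred on zero so every filter is symmetric.
    t = torch.arange(-half, half + 1, dtype=low_hz.dtype) / sample_rate  # (K,)
    f1 = low_hz.abs().unsqueeze(1)         # low cutoff,  shape (N, 1)
    f2 = f1 + band_hz.abs().unsqueeze(1)   # high cutoff, shape (N, 1)
    # torch.sinc(x) = sin(pi x) / (pi x), so 2 f sinc(2 f t) is an ideal low-pass.
    # Broadcasting (N, 1) against (K,) yields every filter in a single step.
    low_pass_1 = 2 * f1 * torch.sinc(2 * f1 * t)   # (N, K)
    low_pass_2 = 2 * f2 * torch.sinc(2 * f2 * t)   # (N, K)
    window = torch.hamming_window(kernel_size, periodic=False, dtype=low_hz.dtype)
    band_pass = (low_pass_2 - low_pass_1) * window
    # Scale each filter so its largest absolute tap is 1 (illustrative choice).
    band_pass = band_pass / band_pass.abs().max(dim=1, keepdim=True)[0]
    return band_pass.unsqueeze(1)


# Illustrative values only (assumed for this example).
kernel_size, sample_rate = 101, 16000
low_hz = torch.tensor([100.0, 500.0, 1000.0, 2000.0])
band_hz = torch.tensor([200.0, 300.0, 500.0, 1000.0])
filters = sinc_filter_bank(low_hz, band_hz, kernel_size, sample_rate)

waveform = torch.randn(2, 1, sample_rate)  # (batch, channels, time)
out = F.conv1d(waveform, filters, padding=(kernel_size - 1) // 2)
print(waveform.shape, out.shape)  # the time dimension is the same in both
```

In a trainable layer, `low_hz` and `band_hz` would be registered as `torch.nn.Parameter` objects so that only the cutoffs are learned.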

Authors

If you use this code or part of it, please cite the original paper authors!

Mirco Ravanelli, Yoshua Bengio, “Speaker Recognition from Raw Waveform with SincNet”, arXiv.