ResNet-STFT-SSL

ResNet-STFT Model for Sound Source Localization

Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

Unofficial PyTorch implementation of He et al., "Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation".

[Figure: Overview of ResNet-STFT]

Dependencies

  • PyTorch <https://pytorch.org/>
  • apkit <https://github.com/hwp/apkit> (version 0.2)

Data

We use the SSLR dataset <https://www.idiap.ch/dataset/sslr> for the experiments.

Usage

  1. Run ./qsub/gen_data_frame_level.sh to extract the features and write them, together with the corresponding labels, to a pickle file.
  2. Run ./qsub/train_with_CNN-STFT.sh to train the model. To train without the two-stage training strategy, run ./qsub/train_with_CNN-STFT-wo2stage.sh instead.
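To give a rough idea of what step 1 produces, here is a minimal sketch of frame-level STFT feature extraction followed by a pickle round trip. It is not the code run by gen_data_frame_level.sh: the function names, the window and hop sizes, the dictionary layout, and the label value are all hypothetical. It uses plain NumPy rather than apkit, and a synthetic signal stands in for a real recording.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def stft_features(signal, win_size=2048, hop_size=1024):
    """Stack real and imaginary STFT parts as input channels.

    signal: float array of shape (n_channels, n_samples).
    Returns an array of shape (2 * n_channels, n_frames, win_size // 2 + 1).
    The window and hop sizes are illustrative assumptions.
    """
    n_channels, n_samples = signal.shape
    window = np.hanning(win_size)
    n_frames = 1 + (n_samples - win_size) // hop_size
    # Cut the signal into overlapping frames: (n_channels, n_frames, win_size).
    frames = np.stack(
        [signal[:, i * hop_size:i * hop_size + win_size] for i in range(n_frames)],
        axis=1,
    )
    # One-sided FFT of each windowed frame.
    spec = np.fft.rfft(frames * window, axis=-1)
    return np.concatenate([spec.real, spec.imag], axis=0).astype(np.float32)


def save_sample(path, features, label):
    """Write one (features, label) pair to a pickle file (hypothetical layout)."""
    with open(path, "wb") as f:
        pickle.dump({"features": features, "label": label}, f)


def load_sample(path):
    """Read a sample written by save_sample."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Synthetic 4-channel signal standing in for a real multi-microphone recording.
rng = np.random.default_rng(0)
audio = rng.standard_normal((4, 48000))

features = stft_features(audio)
label = np.array([30.0], dtype=np.float32)  # hypothetical azimuth in degrees

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.pkl"
    save_sample(sample_path, features, label)
    restored = load_sample(sample_path)

print(features.shape)
```

Stacking the real and imaginary parts along the channel axis gives a real-valued tensor that a convolutional network can consume directly; a training script would load such pickle files and batch them.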