Repository of recipes for the JSALT2019 workshop on "Speaker Detection in Adverse Scenarios with a Single Microphone"
- To clone the repo execute
git clone --recursive https://github.com/jsalt2019-diadet/jsalt2019-diadet.git
- The recursive option downloads some dependencies:
  - hyperion: Python code for the speaker detection back-end
- If you want to update the submodules to the latest commit, run
cd jsalt2019-diadet
git submodule sync
git submodule update --init --recursive --remote
- Dependencies are downloaded in
jsalt2019-diadet/tools
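A quick way to confirm that the submodules were actually checked out is to inspect `git submodule status`, which prefixes uninitialized submodules with `-`. The sketch below counts those lines; `count_uninitialized_submodules` is a hypothetical helper and not part of the repo.

```shell
# Sketch: count submodules that `git submodule status` reports as uninitialized.
# Assumption: the status output is piped in on stdin; lines that start with "-"
# denote submodules that have not been initialized yet.
count_uninitialized_submodules() {
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when the count is zero.
    grep -c '^-' || true
}

# Usage, from inside the cloned repo:
#   git submodule status --recursive | count_uninitialized_submodules
```

A result of 0 suggests that every submodule is initialized; anything else means the `git submodule update` command above should be run again.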
- The recipes also depend on Anaconda3.5, Kaldi, cuDNN, etc.
- Recommended: use the preinstalled versions of these dependencies on the grid, so that each person does not need to maintain their own copy.
- To create links to preinstalled kaldi, anaconda and cudnn, run:
cd jsalt2019-diadet/
./make_clsp_links.sh
- The anaconda installation linked this way provides several environments:
  - base: numpy, h5py, pandas, etc.
  - tensorflow1.8g_cpu: tensorflow 1.8 for cpu
  - tensorflow1.8g_gpu: tensorflow 1.8 for gpu
  - pytorch1.0_cuda9.0: pytorch 1.0 with cuda 9.0
  - pyannote: python3.6 with pyannote-metrics installed
- Anaconda3.5:
- Make a link to your anaconda installation in the tools directory:
cd jsalt2019-diadet/tools/anaconda
ln -s <your-anaconda-3.5> anaconda3.5
- or follow instructions in jsalt2019-diadet/tools/anaconda/full_install.sh to install anaconda from scratch
- Kaldi speech recognition toolkit:
- Make a link to an existing kaldi installation:
cd jsalt2019-diadet/tools/kaldi
ln -s <your-kaldi> kaldi
- or follow instructions in jsalt2019-diadet/tools/anaconda/install_kaldi.sh to install kaldi from scratch
- CuDNN: tensorflow and pytorch need a version of cudnn
- Make a link to an existing cudnn version that matches the requirements of your tensorflow or pytorch, e.g.:
cd jsalt2019-diadet/tools/cudnn
# cudnn v7.4 for cuda 9.0 needed by pytorch 1.0
ln -s /home/janto/usr/local/cudnn-9.0-v7.4 cudnn-9.0-v7.4
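The link commands above all follow the same pattern, so they can be wrapped in a small guard that refuses to create a dangling link. This is only a sketch: `link_tool` is a hypothetical helper that does not exist in the repo, and the paths in the usage note are the placeholders from the instructions above.

```shell
# Sketch: create a symlink only if the target directory exists.
# link_tool is a hypothetical helper, not something shipped with the repo.
link_tool() {
    lt_target=$1
    lt_link=$2
    if [ ! -d "$lt_target" ]; then
        echo "target not found: $lt_target" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file instead of descending into it
    ln -sfn "$lt_target" "$lt_link"
}
```

For example, `cd jsalt2019-diadet/tools/kaldi && link_tool <your-kaldi> kaldi` would behave like the plain `ln -s` call, but fail loudly when the kaldi path is mistyped.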
- The directory structure of the repo looks like this:
./jsalt2019-diadet
./jsalt2019-diadet/tools
./jsalt2019-diadet/tools/anaconda
./jsalt2019-diadet/tools/anaconda/anaconda3
./jsalt2019-diadet/tools/cudnn
./jsalt2019-diadet/tools/cudnn/cudnn-9.0-v7.4
./jsalt2019-diadet/tools/kaldi
./jsalt2019-diadet/tools/kaldi/kaldi
./jsalt2019-diadet/tools/hyperion
./jsalt2019-diadet/tools/hyperion/hyperion
./jsalt2019-diadet/tools/speech_denoising_tools
./jsalt2019-diadet/egs
./jsalt2019-diadet/egs/jsalt2019-diadet
./jsalt2019-diadet/egs/jsalt2019-diadet/v1
./jsalt2019-diadet/src
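Once the links are in place, the layout above can be checked with a few lines of shell. The sketch below assumes the tool paths exactly as they appear in the listing; `check_tools` is a hypothetical helper, and the path list should be adjusted if your links use other names (for instance `anaconda3.5` instead of `anaconda3`).

```shell
# Sketch: report which expected tool directories or links are missing.
# check_tools is a hypothetical helper; the paths are copied from the
# directory listing and may need adjusting to your link names.
check_tools() {
    ct_root=${1:-jsalt2019-diadet}
    ct_missing=0
    for ct_path in \
        tools/anaconda/anaconda3 \
        tools/cudnn/cudnn-9.0-v7.4 \
        tools/kaldi/kaldi \
        tools/hyperion/hyperion
    do
        # -e is false for missing paths and for dangling symlinks.
        if [ ! -e "$ct_root/$ct_path" ]; then
            echo "missing: $ct_root/$ct_path"
            ct_missing=$((ct_missing + 1))
        fi
    done
    # The exit status is the number of missing paths (0 means all present).
    return "$ct_missing"
}
```

Running `check_tools jsalt2019-diadet` from the parent directory of the clone prints one line per missing path and returns a nonzero status if any are absent.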
- Directories:
  - tools: contains external repos and tools like kaldi, python, pyannote, hyperion, cudnn, etc.
  - src: holds code written specifically for this repo.
    - src/kaldi_augmentation: scripts to perform data augmentation using the wav-reverberate kaldi tool
  - egs: contains the recipes
    - egs/jsalt2019-diadet: recipe for speaker diarization/detection/tracking for all datasets used in the workshop.
      - v1: based on kaldi x-vectors
    - egs/sitw_noisy: recipe for SITW with noise and reverberation added to the dev/eval test data. Used to measure the performance of enhancement methods across noise types, noise levels, and RT60 reverberation times.
      - v1: based on kaldi x-vectors