darpa-rats-vad

GMM-based frame-level VAD on the DARPA RATS corpus


GMM-based Speech Activity Detection

The code can also be found in the GitHub repository at https://github.com/langep/darpa-rats-vad.

The original data

This experiment is based on the DARPA RATS corpus. Access was granted by my employer, but I cannot share the data. I have shared the extracted features and indicated below which scripts require the original data and which can be run with the provided output.

Directory structure

  • conf/: feature extraction related configuration
  • ground_truth/: the ground truth labels for the test set utterances
  • local/: copy from wsj steps and other egs
  • scores/: the results from scoring the decoding output against ground truth labels
  • sid/: copy from egs/sre08/v1/
  • steps/: copy from egs/sre08/v1/
  • utils/: copy from egs/sre08/v1/
  • ./corpus-description.txt: contains the original corpus description
  • S/: contains speech class training data directories, one for each channel
  • NS/: contains non-speech class training data directories, one for each channel
  • NT/: contains non-transmitted class training data directories, one for each channel
  • test/: contains test data directories, one for each channel
  • exp/: contains trained models and decoding output
  • mfcc/: contains MFCC features

Contribution

  • scripts in scripts/ have been written by me
  • sid/compute_vad_decisions_gmm.sh has been modified to write results in text form by adding 't,' in front of the wspecifier of the results
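For readers unfamiliar with Kaldi table specifiers: the options of a wspecifier are the comma-separated tokens before the colon, and the t option requests text output instead of binary. The snippet below is only an illustration of that string change. The helper name to_text_wspecifier and the file names vad.1.ark and vad.1.scp are hypothetical and are not taken from the modified script.

```shell
# Hypothetical helper: prepend the text-mode option "t," to a Kaldi wspecifier.
# It only manipulates the string; it does not call any Kaldi binary.
to_text_wspecifier() {
  printf 't,%s\n' "$1"
}

# Illustrative wspecifier with made-up file names.
to_text_wspecifier "ark,scp:vad.1.ark,vad.1.scp"   # prints t,ark,scp:vad.1.ark,vad.1.scp
```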

Running the experiment

Important: before starting, either run scripts/initial_setup.sh or manually modify path.sh to point to your Kaldi installation. If you don't run scripts/initial_setup.sh, you also need to symlink steps and utils from egs/sre08/v1.

Set up the Kaldi location and the data location, create symlinks, etc. !NOTE: You can enter any directory for <path-to-rats-data> if you don't have the DARPA RATS data. In that case, do not run any scripts that are marked as requiring the original data.

bash scripts/initial_setup.sh <path-to-kaldi> <path-to-rats-data>
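If you skip scripts/initial_setup.sh, the steps and utils symlinks can be created by hand. The following is a minimal sketch of that manual route, not a copy of what initial_setup.sh does. The function name link_kaldi_dirs is hypothetical, and path.sh still has to be edited separately because its contents are not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): symlink steps/ and utils/
# from the Kaldi sre08 recipe into the experiment directory.
# Usage: link_kaldi_dirs <path-to-kaldi> [<experiment-dir>]
link_kaldi_dirs() {
  local kaldi_root="$1"
  local exp_dir="${2:-.}"
  local recipe="$kaldi_root/egs/sre08/v1"

  if [ ! -d "$recipe" ]; then
    echo "error: $recipe not found" >&2
    return 1
  fi

  local d
  for d in steps utils; do
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file, not as a target dir
    ln -sfn "$recipe/$d" "$exp_dir/$d" || return 1
  done
}
```

Pass an absolute Kaldi path, for example link_kaldi_dirs /opt/kaldi (an illustrative location), so that the links resolve regardless of the directory they are read from.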

Segment the training audio and prepare it for the experiment. !Requires original data. Output provided.

bash scripts/prepare_train.sh

Create features for training. !Requires original data. Output provided.

bash scripts/make_train_set_features.sh

Create vad decisions for training

bash scripts/make_train_set_vad_decisions.sh

Train channel specific GMM models

bash scripts/train_diag_ubms.sh
bash scripts/train_full_ubms.sh

Train a single model using all channels

bash scripts/train_combined_model.sh 1
# Manually delete C_S.6.093 from S/all/wav.scp and S/all/utt2spk because there are no feats generated for it.
bash scripts/train_combined_model.sh 2
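The manual deletion between the two stages can also be scripted. The sketch below assumes the usual Kaldi layout in which every line of wav.scp and utt2spk starts with the utterance ID followed by whitespace. The helper name remove_utt is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop one utterance ID from Kaldi-style index files,
# where each line starts with the utterance ID followed by whitespace.
# Usage: remove_utt <utt-id> <file>...
remove_utt() {
  local utt="$1"
  shift
  local f tmp
  for f in "$@"; do
    tmp="$(mktemp)" || return 1
    # Keep every line whose first field is not exactly the utterance ID.
    # Appending "" forces a string comparison in awk.
    if awk -v id="$utt" '($1 "") != id' "$f" > "$tmp"; then
      # Write back through redirection so the file keeps its permissions.
      cat "$tmp" > "$f"
    else
      rm -f "$tmp"
      return 1
    fi
    rm -f "$tmp"
  done
}

# Example, run from the experiment directory between stage 1 and stage 2:
# remove_utt C_S.6.093 S/all/wav.scp S/all/utt2spk
```

Matching on the whole first field avoids removing other utterances whose IDs merely begin with C_S.6.093.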

Prepare test data, features, etc. !Requires original data. Output provided.

bash scripts/prepare_test.sh

Run the decoding

bash scripts/decode.sh

Produce the scores

bash scripts/create_scores.sh

The scores can now be found in scores/