/audio-word2vec-pytorch

Reproducing and Improving Audio Word2Vec

Primary LanguagePython

Reproducing Audio-Word2Vec

Sequence-to-sequence neural network. Try out the ToyDataset to understand how it works. Feed MFCC's instead to train Audio-Word2Vec.

Adapted from https://github.com/b-etienne/Seq2seq-PyTorch/ Check it out if you are looking for a good repo on Seq2Seq

Original papers

Getting Started

Prerequisites

Install the packages with pip

pip install -r requirements.txt

Train model

Train and evaluate models with

python main.py --config=<json_config_file>

Examples of config files are given in the "experiments" folder. All config files have to be placed in this directory.

Hyper-parameters

You can tune the following parameters:

  • decoder type (with or without Attention)
  • encoder type (with or without downsampling, with or without preprocessing layers)
  • the encoder's hidden dimension
  • the number of recurrent layers in the encoder
  • the encoder dropout
  • the bidirectionality of the encoder
  • the decoder's hidden dimension
  • the number of recurrent layers in the decoder
  • the decoder dropout
  • the bidirectionality of the decoder