
Fork of the original OpenSeq2Seq

Primary language: Python. License: Apache-2.0


OpenSeq2Seq

Forked OpenSeq2Seq

Adapted to use wav2vec features produced by the fairseq library.

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Acknowledgments

  • NVIDIA OpenSeq2Seq
  • PyTorch fairseq

Usage

  • Use the fairseq library to train a wav2vec model.
  • Use the wav2vec model to featurize the audio files.
  • Put the wav2vec feature files (.h5context file extension) in a folder called 'wav2vec_files', next to a folder called 'wav_files' that contains the original audio files.
  • Adjust your OpenSeq2Seq config file as follows:
from open_seq2seq.data import Speech2TextDataLayer

train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "cache_features": True,
        "cache_regenerate": False,
        "cache_format": "wav2vec",
        "num_audio_features": 512,  # not relevant for cached wav2vec features, but set to the correct value
        ...
    },
}
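
Before training, it can help to confirm that the folder layout from the usage steps is complete. The following is only a minimal sketch, not part of this fork: the helper name `find_missing_features` is hypothetical, and it assumes that each feature file shares its base name with the audio file (for example wav_files/a.wav and wav2vec_files/a.h5context) and that the audio files use the .wav extension.

```python
from pathlib import Path


def find_missing_features(data_root, wav_ext=".wav", feat_ext=".h5context"):
    """Return names of audio files that have no matching wav2vec feature file.

    Assumption (not confirmed by the fork): a feature file shares its base
    name with the audio file, e.g. wav_files/a.wav <-> wav2vec_files/a.h5context.
    """
    root = Path(data_root)
    wav_dir = root / "wav_files"
    feat_dir = root / "wav2vec_files"
    # Base names of all feature files that are present
    feature_stems = {p.stem for p in feat_dir.glob("*" + feat_ext)}
    # Audio files whose base name has no feature counterpart
    return sorted(
        p.name for p in wav_dir.glob("*" + wav_ext) if p.stem not in feature_stems
    )
```

Calling `find_missing_features("/path/to/data")` returns a sorted list of audio file names that still need to be featurized; an empty list means every audio file has a counterpart under the naming assumption above.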