
Facebook AI Research's Automatic Speech Recognition Toolkit


wav2letter++

Join the chat at https://gitter.im/wav2letter/community

wav2letter++ is a highly efficient end-to-end automatic speech recognition (ASR) toolkit written entirely in C++, leveraging ArrayFire and flashlight.

The toolkit started with models that predict letters directly from the raw waveform, and has since evolved into an all-purpose end-to-end ASR research toolkit supporting a wide range of models and learning techniques. It also includes a very efficient modular beam-search decoder for both structured learning (CTC, ASG) and seq2seq approaches.

Important disclaimer: as a number of models from this repository could be used for other modalities, we moved most of the code to flashlight.

This repository includes recipes to reproduce the following research papers as well as pre-trained models:

Data preparation for our training and evaluation can be found in the data folder.

The previous iteration of wav2letter can be found in the:

Build recipes

First, install flashlight with all of its dependencies. Then run:

mkdir build && cd build && cmake .. && make -j8

If flashlight or ArrayFire are installed in nonstandard paths via CMAKE_INSTALL_PREFIX, they can be found by passing -Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake when running cmake.
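
As a minimal sketch of the nonstandard-path case, the snippet below assembles the two -D options from a single prefix and prints the resulting configure command. The default prefix ($HOME/local) and the variable names FLASHLIGHT_CMAKE_DIR and ARRAYFIRE_CMAKE_DIR are assumptions made for illustration; substitute the prefix that was actually used as CMAKE_INSTALL_PREFIX.

```shell
# Hypothetical install prefix (an assumption for illustration); use the value
# that was passed as CMAKE_INSTALL_PREFIX when installing flashlight and ArrayFire.
PREFIX="${PREFIX:-$HOME/local}"

# Directories holding the CMake package files, following the layout quoted above.
FLASHLIGHT_CMAKE_DIR="$PREFIX/usr/share/flashlight/cmake/"
ARRAYFIRE_CMAKE_DIR="$PREFIX/usr/share/ArrayFire/cmake"

# Print the configure command so it can be reviewed before it is run.
echo cmake .. "-Dflashlight_DIR=$FLASHLIGHT_CMAKE_DIR" "-DArrayFire_DIR=$ARRAYFIRE_CMAKE_DIR"
```

If the printed command looks right, run it without the leading echo from inside the build directory, then run make -j8 as above.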

Join the wav2letter community

See the CONTRIBUTING file for how to help out.

License

wav2letter++ is BSD-licensed, as found in the LICENSE file.