API examples using 8-mile for audio
The codebase relies primarily on 8-mile (mead-layers) for its modeling and optimization code.
What's left is pretty much just training and inference code.
The code depends on:
editdistance (for error evaluation)
numpy
six
soundfile
mead-baseline
pytorch
There are a few optional dependencies:
scipy (for on-the-fly resampling of wav files)
ctcdecode (for prefix beam decoding with optional LM)
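
A quick way to see which of the packages listed above are available in the current environment is sketched below. It is only an illustration and not part of this codebase: the helper name `missing_packages` is hypothetical, and the import names are assumptions (in particular `torch` for pytorch and `baseline` for mead-baseline), so adjust them if your installation differs.

```python
from importlib.util import find_spec

# Distribution name -> top-level import name.
# The import names are assumptions, not taken from this repository.
REQUIRED = {
    "editdistance": "editdistance",
    "numpy": "numpy",
    "six": "six",
    "soundfile": "soundfile",
    "mead-baseline": "baseline",  # assumed import name
    "pytorch": "torch",           # assumed import name
}

OPTIONAL = {
    "scipy": "scipy",          # on-the-fly resampling of wav files
    "ctcdecode": "ctcdecode",  # prefix beam decoding with optional LM
}


def missing_packages(packages):
    """Return the distribution names whose import name cannot be located."""
    return [dist for dist, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing_required = missing_packages(REQUIRED)
    missing_optional = missing_packages(OPTIONAL)
    if missing_required:
        print("Missing required packages:", ", ".join(missing_required))
    if missing_optional:
        print("Missing optional packages:", ", ".join(missing_optional))
    if not missing_required and not missing_optional:
        print("All listed packages were found.")
```

The check only looks up whether each module can be located; it does not import anything, so it stays fast even when pytorch is installed.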