Language bindings for Vosk and Kaldi to access high-quality speech recognition algorithms from various languages and on various platforms.
- Python on Linux, Windows and Raspberry Pi
- Node
- Java
- Android
- iOS
This is a fork of alphacep/vosk-api aimed at providing timely releases and minor additions to the upstream project. I intend to submit PRs for my changes, in the hope that they get merged and this fork can be archived.
Changes: the list of changes in this fork can be seen using GitHub compare view.
Releases: the latest release can be downloaded from the Releases page.
While the upstream project uses three-number version strings, this fork appends a fourth number so as to indicate which vosk-api release it's based on and not clash with the upstream releases.
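As an illustration of this versioning scheme, the sketch below splits a fork version string into the upstream vosk-api version and the fork-specific fourth number. The helper name `split_fork_version` and the sample version `0.3.7.2` are hypothetical and chosen only for the example.

```python
from typing import Tuple


def split_fork_version(version: str) -> Tuple[str, int]:
    """Split a four-number fork version into (upstream_version, fork_number).

    Hypothetical helper: assumes the first three numbers are the upstream
    vosk-api release and the fourth number belongs to this fork.
    """
    parts = version.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected four dot-separated numbers, got %r" % version)
    upstream = ".".join(parts[:3])
    return upstream, int(parts[3])


# "0.3.7.2" is a made-up version string used only for illustration.
upstream, fork_number = split_fork_version("0.3.7.2")
print(upstream, fork_number)
```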
Development Plans
- Create automatic build pipeline
- Produce easily-installable packages for Windows
- Produce easily-installable packages for Mac
- Publish Java JAR to Maven Central
cd android
gradle build
Please note that the Medium blog post about 64-bit builds is no longer relevant: the script builds x86, arm64 and armv7 libraries automatically without any modifications.
The easiest way to install Vosk is with pip. You do not have to compile anything. We currently support Linux on x86_64 and Raspberry Pi (armv6 and armv7). Mac builds will come soon.
Make sure you have a recent pip and Python:
- Python version >= 3.5
- pip version >= 19.0
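The two minimums above can be checked from Python itself. This is a minimal sketch, assuming pip is importable from the interpreter you plan to use and exposes `pip.__version__`; the helper names `version_tuple` and `meets_minimum` are hypothetical.

```python
import re
import sys
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '19.0.3' into (19, 0, 3), ignoring suffixes such as 'b1'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no leading version number in %r" % version)
    return tuple(int(n) for n in match.group(0).split("."))


def meets_minimum(version: str, minimum: str) -> bool:
    """Return True if `version` is at least `minimum`."""
    have, need = version_tuple(version), version_tuple(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '19' and '19.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


python_version = "%d.%d" % sys.version_info[:2]
print("Python", python_version, "ok" if meets_minimum(python_version, "3.5") else "too old")

try:
    import pip  # assumption: pip is installed for this interpreter
    print("pip", pip.__version__, "ok" if meets_minimum(pip.__version__, "19.0") else "too old")
except ImportError:
    print("pip is not importable from this interpreter")
```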
Upgrade Python and pip if needed. Then install Vosk on Linux with a single command:
pip3 install vosk
If you still want to build from scratch, you can compile Kaldi and Vosk yourself. Compilation is straightforward but can be confusing for newcomers. If you follow these steps, watch the output closely for errors.
git clone -b lookahead --single-branch https://github.com/alphacep/kaldi
cd kaldi/tools
make
# install all dependencies and repeat make if needed
extras/install_openblas.sh
cd ../src
./configure --mathlib=OPENBLAS --shared --use-cuda=no
make -j 10
cd java && KALDI_ROOT=<KALDI_ROOT> make
wget https://github.com/alphacep/kaldi-android-demo/releases/download/2020-01/alphacep-model-android-en-us-0.3.tar.gz
tar xf alphacep-model-android-en-us-0.3.tar.gz
mv alphacep-model-android-en-us-0.3 model
make run
Then build the Python module:
export KALDI_ROOT=<KALDI_ROOT>
cd python
python3 setup.py install
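Because the build depends on `KALDI_ROOT`, it can help to verify the variable before running `setup.py`. The sketch below is only a sanity check under one assumption: a usable Kaldi tree contains the `src` and `tools` directories used in the build steps above. The helper name `find_kaldi_root` is hypothetical and is not part of Vosk or Kaldi.

```python
import os
from typing import Mapping, Optional


def find_kaldi_root(env: Optional[Mapping[str, str]] = None) -> str:
    """Return KALDI_ROOT if it looks like a Kaldi checkout, else raise."""
    env = os.environ if env is None else env
    root = env.get("KALDI_ROOT")
    if not root:
        raise RuntimeError("KALDI_ROOT is not set")
    # Assumption: the checkout has the src/ and tools/ directories built earlier.
    missing = [d for d in ("src", "tools") if not os.path.isdir(os.path.join(root, d))]
    if missing:
        raise RuntimeError("%s is missing: %s" % (root, ", ".join(missing)))
    return root


try:
    print("Using Kaldi at", find_kaldi_root())
except RuntimeError as err:
    print("Cannot build yet:", err)
```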
Run like this:
cd vosk-api/python/example
wget https://github.com/alphacep/kaldi-android-demo/releases/download/2020-01/alphacep-model-android-en-us-0.3.tar.gz
tar xf alphacep-model-android-en-us-0.3.tar.gz
mv alphacep-model-android-en-us-0.3 model
python3 ./test_local.py test.wav
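For orientation, here is a rough sketch of what a script along these lines might do. It assumes the Python package exposes `Model` and `KaldiRecognizer` with the `AcceptWaveform`, `Result` and `FinalResult` methods, and that results are JSON with a `text` field. The helper names `iter_pcm_chunks` and `transcribe` are hypothetical, and the bundled `test_local.py` remains the authoritative example.

```python
import json
import wave
from typing import Iterator


def iter_pcm_chunks(wav_path: str, frames_per_chunk: int = 4000) -> Iterator[bytes]:
    """Yield the raw PCM data of a WAV file in fixed-size chunks."""
    with wave.open(wav_path, "rb") as wf:
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield data


def transcribe(wav_path: str, model_dir: str = "model") -> str:
    """Return the recognized text for a 16 kHz, 16-bit, mono WAV file."""
    # Imported lazily so the chunk helper works without Vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    rec = KaldiRecognizer(model, 16000)
    pieces = []
    for data in iter_pcm_chunks(wav_path):
        # AcceptWaveform returns True once an utterance is complete.
        if rec.AcceptWaveform(data):
            pieces.append(json.loads(rec.Result()).get("text", ""))
    # Flush whatever is still buffered in the recognizer.
    pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

With Vosk installed and a model unpacked into `model`, calling `transcribe("test.wav")` should return the recognized text as a single string.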
There are models for other languages available too.
To run with your own audio file, make sure it is in the proper format: PCM, 16 kHz, 16-bit, mono. Otherwise decoding will not work.
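The format requirement can be verified before decoding with Python's standard `wave` module, which only reads uncompressed PCM WAV files. This is a minimal sketch; the function name `check_wav_format` is hypothetical.

```python
import wave
from typing import List


def check_wav_format(path: str) -> List[str]:
    """Return a list of format problems; an empty list means the file fits."""
    try:
        wf = wave.open(path, "rb")
    except (wave.Error, EOFError) as err:
        # wave refuses anything that is not an uncompressed PCM WAV file.
        return ["not a readable PCM WAV file: %s" % err]
    problems = []
    with wf:
        if wf.getnchannels() != 1:
            problems.append("expected mono, got %d channels" % wf.getnchannels())
        if wf.getsampwidth() != 2:
            problems.append("expected 16-bit samples, got %d-bit" % (8 * wf.getsampwidth()))
        if wf.getframerate() != 16000:
            problems.append("expected 16000 Hz, got %d Hz" % wf.getframerate())
    return problems
```

For example, `check_wav_format("test.wav")` returns an empty list when the file is 16 kHz, 16-bit, mono PCM, and a list of human-readable problems otherwise.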
A microphone example will come soon.