pykaldi

Python wrapper for Kaldi decoders (Kaldi https://sourceforge.net/projects/kaldi/)


Fork of Kaldi for developing custom recognisers for the Alex spoken dialogue system framework.

News & info

  • We use Docker so that you can easily try our decoding demo.

    • Run the demo using two commands:

      1. Download the image: docker pull ufaldsg/pykaldi

      2. Run the demo: docker run ufaldsg/pykaldi /bin/bash -c "cd online_demo; make gmm-latgen-faster; make online-recogniser; make pyonline-recogniser"

        • Note that the demo downloads the pretrained models and test data, which you may save using the docker commit functionality.
    • Start exploring the demo source code in online_demo/pykaldi-online-latgen-recogniser.py and onl-rec/onl-rec-latgen-recogniser-demo.cc.

    • Please note that, when using Docker, you need to change the Pykaldi source code inside the Docker image to affect the demo behaviour.

  • The Python wrapper of the C++ OnlineLatticeRecogniser implements MFCC, LDA+MLLT, and bMMI acoustic models, since this was the best speaker-independent setup.

  • UPDATE: Since 11/18/2014 the Pykaldi fork uses the official Kaldi code (src/online2), which is very similar to our previous implementation (and was finished roughly 8 months after our implementation).
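
The Docker steps above (pull, run, then save the downloaded models with docker commit) can be scripted. The following is only a minimal sketch: the image name and the demo command come from this README, while the container name pykaldi-demo and the snapshot tag pykaldi-demo:models are hypothetical names chosen for illustration. It prints the commands by default and only calls Docker when dry_run is set to False.

```python
import shlex
import subprocess

IMAGE = "ufaldsg/pykaldi"         # image name given in this README
CONTAINER = "pykaldi-demo"        # hypothetical container name (assumption)
SNAPSHOT = "pykaldi-demo:models"  # hypothetical tag for the saved state (assumption)
DEMO = ("cd online_demo; make gmm-latgen-faster; "
        "make online-recogniser; make pyonline-recogniser")


def demo_commands(image=IMAGE, container=CONTAINER, snapshot=SNAPSHOT):
    """Return the Docker commands as argument lists, in execution order."""
    return [
        # 1. Download the image.
        ["docker", "pull", image],
        # 2. Run the demo in a named container so it can be committed later.
        ["docker", "run", "--name", container, image, "/bin/bash", "-c", DEMO],
        # 3. Save the container state (downloaded models and test data)
        #    as a new local image.
        ["docker", "commit", container, snapshot],
    ]


def run_demo(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in demo_commands():
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_demo() lists the three commands without touching Docker; run_demo(dry_run=False) executes them and stops at the first failure. Later runs could then start from the committed snapshot image instead of ufaldsg/pykaldi, so the models would not need to be downloaded again.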

Install

Build status (Travis CI): https://travis-ci.org/UFAL-DSG/pykaldi.svg?branch=master
  • Our priority is to deploy it on Ubuntu 14.04 and also keep Travis running on Ubuntu 12.04
  • Read INSTALL.rst and INSTALL first!
  • INSTALL.rst contains instructions specific to this fork. INSTALL contains the general instructions for Kaldi.

LICENSE

History

The fork presented three new Kaldi features in the thesis of Ondrej Platek (see commit 8e534b16bb8a350):

  • Training scripts which can be used with standard Kaldi tools or with the new OnlineLatticeRecogniser.

The scripts for Czech and English support acoustic models obtained using MFCC and LDA+MLLT/delta+delta-delta feature transformations, trained either generatively or discriminatively by MPE or bMMI training.

The new functionality was separated into different directories:
  • pykaldi/src/onl-rec stores C++ code for OnlineLatticeRecogniser.
  • pykaldi/pykaldi stores Python wrapper PyOnlineLatticeRecogniser.
  • kaldi/egs/vystadial_{cz,en}/s5 stores training scripts. [merged into the official Kaldi repo]
  • kaldi/online_demo shows the standard Kaldi decoder, OnlineLatticeRecogniser and PyOnlineLatticeRecogniser, which produce exactly the same lattices when using the same setup.

The OnlineLatticeRecogniser is used in the Alex dialogue system (https://github.com/UFAL-DSG/alex).

In March 2014, the PyOnlineLatticeRecogniser was evaluated on the Alex domain. See graphs evaluating OnlineLatticeRecogniser performance at http://nbviewer.ipython.org/github/oplatek/pykaldi-eval/blob/master/Pykaldi-evaluation.ipynb.

An example posterior word lattice output for one Czech utterance can be seen at http://oplatek.blogspot.it/2014/02/ipython-demo-pykaldi-decoders-on-short.html

Other info

  • This Kaldi fork is developed under the Vystadial project.
  • It is based on the SVN trunk of the Kaldi project, which is mirrored to the branch svn-mirror.
  • The SVN trunk is mirrored via git svn. Check out the tutorials: Git svn, Svn branch in git.