cmusphinx/sphinxbase

SWIG API is missing a variety of useful functions / tests

dhdaines opened this issue · 2 comments

Hi! I've already added a few missing functions to the SWIG API (in particular, the FSG API was useless without the ability to set the final state index!). However, they should also have proper unit tests.

I am wondering if things like the final_state should also be implemented as properties in Python.
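A rough sketch of what that could look like, not the actual binding: `FsgModelStub`, `get_final_state` and `set_final_state` are hypothetical stand-ins for whatever the SWIG wrapper really exposes. The idea is to layer a property over accessor-style methods so that both styles keep working.

```python
class FsgModelStub:
    """Hypothetical stand-in for a SWIG-wrapped FSG model (not the real class)."""

    def __init__(self, n_state):
        self._n_state = n_state
        self._final_state = 0

    # Accessor-style methods, in the shape a generated wrapper might expose.
    def get_final_state(self):
        return self._final_state

    def set_final_state(self, state):
        # Reject indices outside the model's state range.
        if not 0 <= state < self._n_state:
            raise ValueError("state index out of range")
        self._final_state = state

    # Property layered over the accessors; existing callers are unaffected.
    final_state = property(get_final_state, set_final_state)


fsg = FsgModelStub(n_state=4)
fsg.final_state = 3          # attribute-style write goes through the setter
```

Keeping the accessor methods alongside the property would avoid breaking any code that already calls them.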

Also we need to make sure that any changes we make remain compatible with the external PyPI pocketsphinx module, which is probably what people are actually using (I am!) - should we change the name of the built-in modules to avoid confusion?

Hey David, thanks for looking into this.

The first huge issue is accuracy, of course, but speaking of the API, the biggest issue is not the utility functions that people will rarely use but the lack of a reasonable streaming API for Python. Right now endpointing is still a pain, and so is proper timing in a long stream.

The second thing is error handling: errors still go only to the log file, and people are very often confused about why something failed because logging is disabled. The API should probably print errors by default and ignore debug messages.
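To illustrate the intended behavior only, here is a minimal sketch using Python's standard `logging` module. It does not reflect how sphinxbase's own error logging is implemented; the logger name `sphinx_demo`, the helper `make_logger` and the messages are made up for the example.

```python
import io
import logging


def make_logger(stream, level=logging.ERROR):
    """Build a logger that reports errors by default and drops debug output."""
    logger = logging.getLogger("sphinx_demo")  # hypothetical logger name
    logger.handlers.clear()                    # avoid duplicate handlers on re-run
    logger.setLevel(level)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


buf = io.StringIO()                      # stands in for stderr in this sketch
log = make_logger(buf)
log.debug("frame 120 scored")            # suppressed at the default level
log.error("acoustic model not found")    # always shown
```

With this default, a user who never configures logging still sees the error, while passing `level=logging.DEBUG` would bring the debug messages back.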

The PyPI module is not something I'd recommend to people. It makes several questionable design decisions, such as bundling audio code. It also still does not transcribe audio files properly, see here: https://stackoverflow.com/questions/53024632/pocketsphinx-python-does-not-return-last-utterance-while-iterating-over-audio

Hmm. The problem is that when you type 'pip install pocketsphinx', you get that module, and it (mostly) works. And it doesn't coexist well with the built-in sphinxbase and pocketsphinx modules. Really it is a problem with that module, though...

Agreed that these API functions aren't super important. I just happened to need them :) because I am using the FSG code to do word-alignment.