NOTE: This is currently under development.
A collection of utilities related to CTC, with the goal of being fast and highly flexible.
- CTC Decode
  - Greedy Decoder
  - Beam Search Decoder
  - Beam Search Decoder with KenLM
  - Beam Search Decoder with user-defined LM
- Python bindings
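For readers unfamiliar with CTC decoding, the idea behind a greedy decoder can be sketched in a few lines of plain Python. This is a generic illustration of the technique, not ctclib's implementation: the function name `ctc_greedy_decode`, the `blank` parameter, and the use of "_" as the blank token are assumptions made for this example.

```python
import math


def ctc_greedy_decode(log_probs, vocab, blank="_"):
    """Collapse the per-frame argmax path into a CTC output string.

    Hypothetical helper for illustration; not part of ctclib or pyctclib.
    """
    blank_index = vocab.index(blank)
    decoded = []
    previous = None
    for frame in log_probs:
        # Pick the most likely token index for this time step.
        best = max(range(len(frame)), key=lambda i: frame[i])
        # Emit a token only when it differs from the previous frame
        # and is not the blank.
        if best != previous and best != blank_index:
            decoded.append(vocab[best])
        previous = best
    return "".join(decoded)


vocab = ["a", "b", "c", "_"]
frames = [
    [0.70, 0.10, 0.10, 0.10],  # a
    [0.60, 0.20, 0.10, 0.10],  # a again (repeat, collapsed)
    [0.10, 0.10, 0.10, 0.70],  # blank
    [0.80, 0.10, 0.05, 0.05],  # a (new token, separated by the blank)
    [0.10, 0.70, 0.10, 0.10],  # b
]
log_probs = [[math.log(p) for p in frame] for frame in frames]
result = ctc_greedy_decode(log_probs, vocab)
print(result)  # aab
```

A beam search decoder keeps several candidate prefixes per time step instead of a single best path, which is what allows a language model score to influence the result.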
ctclib depends on kpu/kenlm.
You must install the following libraries as KenLM dependencies.
- Boost
- Eigen3
For example, on Ubuntu (or another Debian-based Linux distribution), you can install them by running the following command:
apt install libboost-all-dev libeigen3-dev
Currently, ctclib isn't available on crates.io, but you can use it as a git dependency.
[dependencies]
ctclib = { version = "*", git = "https://github.com/agatan/ctclib" }
ctclib provides a Python interface named pyctclib.
Currently, pyctclib isn't available on PyPI, but you can install it as a git dependency.
Ensure that you have installed cargo and libclang-dev.
pip install 'git+https://github.com/agatan/ctclib.git#egg=pyctclib&subdirectory=bindings/python'
import pyctclib

decoder = pyctclib.BeamSearchDecoderWithKenLM(
    pyctclib.BeamSearchDecoderOptions(
        beam_size=100,
        beam_size_token=1000,
        beam_threshold=1,
        lm_weight=0.5,
    ),
    "/path/to/model.arpa",  # KenLM model file
    ["a", "b", "c", "_"],  # vocabulary tokens
)
decoder.decode(log_probs)
# Alternatively, you can use a user-defined LM.
# See pyctclib.LMProtocol
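The usage snippet passes `log_probs` to the decoder without defining it. As a minimal sketch, assuming the decoder expects per-frame log-probabilities over the vocabulary, raw model scores can be normalized with a log-softmax. The helper `log_softmax` and the sample values below are hypothetical, and the container type pyctclib expects (for example a NumPy array) is not stated here, so convert the result as needed.

```python
import math


def log_softmax(logits):
    """Convert one frame of raw scores into log-probabilities.

    Hypothetical helper for illustration; not part of pyctclib.
    """
    # Subtract the maximum first so exp() cannot overflow.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


# Two frames of made-up scores over the vocabulary ["a", "b", "c", "_"].
frame_logits = [
    [2.0, 0.5, 0.1, -1.0],
    [0.0, 0.0, 0.0, 3.0],
]
log_prob_rows = [log_softmax(row) for row in frame_logits]
```

Each row of `log_prob_rows` exponentiates to a distribution that sums to 1, and the ordering of the scores within a frame is preserved.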