sherpa

sherpa is an open-source speech-to-text inference framework built on PyTorch. It focuses exclusively on end-to-end (E2E) models, namely transducer- and CTC-based models, and provides both C++ and Python APIs.

This project focuses on deployment, i.e., using pre-trained models to transcribe speech. If you are interested in training or fine-tuning your own models, please refer to icefall.

We also have other similar projects that don't depend on PyTorch: