This project aims to provide applications/pipelines and common code for performing fast automatic speech recognition, transcription and translation using open-source models based on OpenAI's Whisper project.
pip install "whisper-stream[{feature},...] @ git+https://github.com/ksquarekumar/whisper-stream.git@main"
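The requirement string above can also be assembled from a list of features. The following is a minimal sketch, not part of the project: the helper name `whisper_stream_requirement` and the feature names in the usage comment are hypothetical, and the real extras are the ones listed in the project manifest.

```shell
# Build a pip requirement string for whisper-stream with the given extras.
# NOTE: this helper is illustrative only; the feature names passed in are
# placeholders, so check the project manifest for the real ones.
whisper_stream_requirement() {
    repo="git+https://github.com/ksquarekumar/whisper-stream.git@main"
    extras=""
    for feature in "$@"; do
        # Join features with commas: a,b,c
        extras="${extras:+${extras},}${feature}"
    done
    if [ -n "$extras" ]; then
        printf 'whisper-stream[%s] @ %s\n' "$extras" "$repo"
    else
        printf 'whisper-stream @ %s\n' "$repo"
    fi
}

# Usage (feature_a and feature_b are hypothetical names):
# pip install "$(whisper_stream_requirement feature_a feature_b)"
```

Quoting the whole requirement keeps the shell from treating the square brackets as a glob pattern.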
This project uses pyenv, mamba and poetry to manage environments, dependencies and building wheels.
For correct building of artifacts, this project also relies on some poetry plugins: poetry-conda and poetry-multiproject-plugin.
For available extras/features, refer to the extras section under [tool.poetry.extras] in the project manifest.
git clone https://github.com/ksquarekumar/whisper-stream.git
curl https://pyenv.run | bash
pyenv install mambaforge-22.9.0-3 && pyenv shell mambaforge-22.9.0-3 && mamba activate base
mamba install poetry
mamba update --name base --update-all
exec "$SHELL"
poetry self add poetry-conda poetry-multiproject-plugin
poetry self update
pyenv global mambaforge-22.9.0-3
exec "$SHELL"
3. Create a project environment (named: whisper_py311) from the existing conda.yml manifest.
mamba env create -f conda.yml && mamba activate whisper_py311
4. Initialize poetry with the correct python and install project dependencies in a project-local virtual environment with poetry.
mamba activate whisper_py311
poetry env use "$(which python)"
poetry install -E "[list of features,..]"
- For development installs you probably want all of the "[dev,test]" groups, so "poetry install" is what you need.
- For non-development installs you probably want to exclude the "[dev,test]" groups, so install with:
poetry install --only main
pre-commit install --install-hooks
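After the setup steps above, a quick sanity check can confirm that the tools are reachable and the expected environment is active. This is a minimal sketch under stated assumptions: the helper names `missing_tools` and `in_conda_env` are hypothetical, and the check relies on the `CONDA_DEFAULT_ENV` variable that conda/mamba set on activation.

```shell
# Print each required tool that is not on PATH; prints nothing when all are found.
# (Illustrative helper, not part of the project.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed only when the active conda/mamba environment has the expected name.
# Assumes CONDA_DEFAULT_ENV is set by conda/mamba on activation.
in_conda_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Report problems on stderr without aborting the shell session.
missing="$(missing_tools pyenv mamba poetry pre-commit)"
[ -z "$missing" ] || echo "Missing tools: $(echo $missing)" >&2
in_conda_env whisper_py311 || echo "Environment whisper_py311 is not active" >&2
```

The checks only report; they do not install or activate anything, so they are safe to run repeatedly.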
Assumes the source is present on the system.
- within the system python for containers
pip install -r projects/{feature_set}/requirements.txt
pip install ".[{feature_set_extras},..]"
- with conda as the system's environment manager
conda install mamba
mamba env update -f conda.yml
pip install -r projects/{feature_set}/requirements.txt
pip install ".[{feature_set_extras},..]"
Some jax modules are partially vendored from whisper-jax.
whisper-stream is distributed under the terms of the Apache-2.0 license.