/openai_whisper_cli

OpenAI Whisper CLI - converts speech into text

This project is a Python command-line interface (CLI) that converts speech into text. It uses a pre-trained model from Hugging Face to perform the transcription.
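To make the flow concrete, here is a minimal sketch of how such a CLI could be wired together. It is not the project's actual hello.py. It assumes the Hugging Face `transformers` package with its `pipeline` API, the `openai/whisper-tiny` checkpoint, and `ffmpeg` available for audio decoding. Only the `--path` option comes from this README; the `--model` option and the function names are hypothetical.

```python
import argparse


def build_parser():
    """Build the argument parser; --path mirrors the option shown in this README."""
    parser = argparse.ArgumentParser(
        prog="whisper", description="Convert speech in an audio file into text."
    )
    parser.add_argument("--path", required=True, help="Path to the audio file")
    # Hypothetical option: the checkpoint name is an assumption, not taken from the project.
    parser.add_argument(
        "--model", default="openai/whisper-tiny", help="Hugging Face model id"
    )
    return parser


def transcribe(path, model_name):
    """Run speech recognition on one file and return the transcript text."""
    # Imported lazily so that argument parsing works without the heavy dependency.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_name)
    return asr(path)["text"]


def main(argv=None):
    """Parse arguments, transcribe the file, and print the result."""
    args = build_parser().parse_args(argv)
    print(transcribe(args.path, args.model))
```

Calling `main(["--path", "audio.mp3"])` would download the checkpoint on first use and print the transcript. Keeping the parser separate from the model call makes the argument handling easy to test without loading any weights.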

Project Goals/Outcomes

  • Develop a Python CLI with OpenAI Whisper
  • Use GitHub Codespaces and Copilot
  • Integrate libtorch and Hugging Face pre-trained models into a Python CLI project

Architectural Diagram

[Image: architectural diagram]

Demo

[Image: demo]

Setup Manually

  1. Install Python
  2. Install dependencies
make install

  3. Run the CLI
python hello.py --path audio.mp3

Or run as a package

  • Install dependencies
make setup

  • Run the CLI
whisper --path audio.mp3

Docker (Recommended)

  • The main branch of this repo is automatically published to Docker Hub through CI/CD. Pull the image:
docker pull szheng3/whisper-ml-cli:latest
  • Run the Docker image:
docker run szheng3/whisper-ml-cli:latest 'audio.mp3'
  • Run it with your own audio file by mounting the file's directory into the container:
docker run -v /path/to/your/audio:/app/audio szheng3/whisper-ml-cli:latest /app/audio/audio.mp3
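The volume mount above maps a host directory to /app/audio inside the container, and the final argument is the file's path as seen from inside the container. As an illustration only, the small helper below builds that command for an arbitrary local file. The function name `docker_run_args` is hypothetical; the image name and mount point are the ones shown in this README.

```python
from pathlib import Path
from typing import List

IMAGE = "szheng3/whisper-ml-cli:latest"  # image name from this README
MOUNT_POINT = "/app/audio"               # container directory used in this README


def docker_run_args(audio_file: str) -> List[str]:
    """Return the docker argument list that mounts the file's folder and transcribes it."""
    host_path = Path(audio_file).resolve()  # Docker bind mounts need an absolute host path
    return [
        "docker", "run",
        "-v", f"{host_path.parent}:{MOUNT_POINT}",  # host directory -> container directory
        IMAGE,
        f"{MOUNT_POINT}/{host_path.name}",          # path as seen inside the container
    ]
```

The returned list can be passed to `subprocess.run(docker_run_args("talk.mp3"), check=True)` on a machine where Docker is installed.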

CI/CD

GitHub Actions workflows are configured in .github/workflows.

Progress Log

  • Configure GitHub Codespaces
  • Initialise the Python project with a pre-trained model from Hugging Face
  • CI/CD with GitHub Actions
  • Tags and releases
