audio2text

This repo is a guide for generating text from the audio in a WAV file. This is different from extracting subtitles, more commonly known as closed captions. The vosk module requires the input audio file to be a mono PCM WAV file. The output is in JSON format. It is not perfect, but it works.
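Since the input must be a mono PCM WAV file, it can help to check a file before transcribing it. The following is a minimal sketch using only Python's standard `wave` module; the function name `is_mono_pcm_wav` is made up for this example and is not part of vosk or this repo.

```python
import wave


def is_mono_pcm_wav(path):
    """Return True if `path` is an uncompressed (PCM) WAV file with one channel.

    Hypothetical helper for illustration; not part of vosk.
    """
    try:
        with wave.open(path, "rb") as wf:
            # getcomptype() is "NONE" for uncompressed PCM data.
            return wf.getnchannels() == 1 and wf.getcomptype() == "NONE"
    except (wave.Error, EOFError):
        # The wave module rejects files it cannot parse as PCM WAV.
        return False
```

For example, `is_mono_pcm_wav("your-input.wav")` should return False for a stereo recording, which would then need to be converted to mono first.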

A Vosk model is required. The installation steps include a link to a small US English model, which is about 36 megabytes.

Requirements

vosk
Python 3.8 or later
pip

Installation

git clone git@github.com:c0debreaker/audio2text.git
cd audio2text
python3 -m venv venv
source venv/bin/activate
pip install vosk
wget http://alphacephei.com/vosk/models/vosk-model-small-en-us-0.3.zip
Extract the zip file into the current folder and rename the extracted folder to model
git clone https://github.com/alphacep/vosk-api
python vosk-api/python/example/test_simple.py <your-input.wav> > <any-filename-output.json>
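
The redirected output may be easier to use once the recognized text is pulled out of it. The sketch below assumes the output file contains several JSON objects printed one after another, where finished phrases carry a "text" key and intermediate results carry a "partial" key. That shape is an assumption about what test_simple.py prints and may differ between vosk versions. The function name `extract_text` is made up for this example.

```python
import json


def extract_text(raw):
    """Join the "text" field of every JSON object found in `raw`.

    Assumes `raw` holds JSON objects printed back to back, as the
    example script's output is assumed to look. Objects without a
    non-empty "text" field (such as partial results) are skipped.
    """
    decoder = json.JSONDecoder()
    pieces = []
    pos = 0
    while pos < len(raw):
        # Skip whitespace between consecutive objects.
        if raw[pos].isspace():
            pos += 1
            continue
        # raw_decode parses one value and returns the index where it ended.
        obj, pos = decoder.raw_decode(raw, pos)
        text = obj.get("text") if isinstance(obj, dict) else None
        if text:
            pieces.append(text)
    return " ".join(pieces)
```

It could be used as `print(extract_text(open("output.json").read()))`, where `output.json` stands for whatever filename was chosen in the last step.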
