A simple command-line tool that creates text transcripts from audio files using IBM Watson Speech to Text.
Installing from PyPI is the easiest way:
$ pip install speech-to-text
Or install the development version:
$ git clone https://github.com/rmotr/speech-to-text
$ cd speech-to-text
$ mkvirtualenv speech-to-text
$ pip install -r requirements.txt
The first thing you'll need is your Bluemix username and password. Getting them is a tedious process; if you run into issues, we've written a blog post that describes how to do it. Once you have your username and password, you can run:
$ speech_to_text -u <MY-USERNAME> -p <MY-PASSWORD> -f html -i <AUDIO-FILE> transcript.html
(You can omit the password option and you'll be prompted to type it in a secure manner.)
The -i option receives the audio file that you want to transcribe, and the text is stored in transcript.html in HTML format. To select a different format, see below.
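If you want to call the tool from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of the tool itself: the helper names build_command and transcribe are hypothetical, and only the -u, -p, -f and -i options shown above are used.

```python
import subprocess


def build_command(username, audio_file, output_file, fmt="html", password=None):
    """Return the argument list for the speech_to_text command shown above.

    Hypothetical helper for illustration; it only uses the documented
    -u, -p, -f and -i options.
    """
    cmd = ["speech_to_text", "-u", username]
    if password is not None:
        # When -p is omitted, the tool prompts for the password instead.
        cmd += ["-p", password]
    cmd += ["-f", fmt, "-i", audio_file, output_file]
    return cmd


def transcribe(username, audio_file, output_file, fmt="html", password=None):
    """Run the command; assumes speech_to_text is installed and on PATH."""
    cmd = build_command(username, audio_file, output_file, fmt, password)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with file names that contain spaces.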
There are currently 4 built-in formatters: html (default), markdown, json, and original. You can pass the -f option with any of those formatters.
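When scripting around the tool, it can be handy to derive the -f value from the output file name. The sketch below is only an illustration: the extension mapping and the pick_formatter helper are assumptions, not something the tool does for you, and unknown extensions simply fall back to the default html formatter.

```python
import os

# The four formatter names listed above.
FORMATTERS = ("html", "markdown", "json", "original")

# Assumed mapping from output file extension to formatter name.
EXTENSION_TO_FORMATTER = {
    ".html": "html",
    ".md": "markdown",
    ".json": "json",
}


def pick_formatter(output_file, default="html"):
    """Guess a value for -f from the output file name (hypothetical helper)."""
    ext = os.path.splitext(output_file)[1].lower()
    return EXTENSION_TO_FORMATTER.get(ext, default)
```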
Under the examples/ directory you can find a short audio file containing the first 30 seconds of Jacob Kaplan-Moss's keynote from PyCon 2015. The resulting transcriptions (in HTML and Markdown format) are there as well.
API reference: https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#recognize_sessionless_nonmp12
Audio file types supported:
- audio/flac
- audio/l16
- audio/wav
- audio/ogg;codecs=opus
- audio/mulaw
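To check up front whether a file is likely to be accepted, you could map its extension to one of the content types listed above. This is a rough sketch and not how the tool itself detects formats: the guess_content_type helper is hypothetical, and the .raw and .ulaw extensions in particular are assumed conventions for audio/l16 and audio/mulaw.

```python
import os

# Content types from the list above, keyed by an assumed file extension.
CONTENT_TYPES = {
    ".flac": "audio/flac",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg;codecs=opus",
    ".raw": "audio/l16",     # assumption: raw 16-bit linear PCM
    ".ulaw": "audio/mulaw",  # assumption: mu-law encoded audio
}


def guess_content_type(path):
    """Return the content type for a file name, or raise ValueError."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in CONTENT_TYPES:
        raise ValueError("unsupported audio file extension: %r" % ext)
    return CONTENT_TYPES[ext]
```

For example, guess_content_type("keynote.flac") returns "audio/flac", while a file ending in .mp3 raises ValueError because no matching type appears in the list.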