tscribe

Produce Word Document, CSV, SQLite and VTT transcriptions using the automatic speech recognition from AWS Transcribe.

Installation

pip install git+https://github.com/clane9/aws_transcribe_to_docx

Results

Time     Speaker  Comment
0:00:03  spk_0    Is this on?
0:00:05  spk_1    Yep.
0:00:06  spk_0    Great.

Usage

Getting started

Import tscribe and pass the path of your AWS Transcribe output .json file to tscribe.write(...).

import tscribe

tscribe.write("output.json")

output.docx written in x seconds.
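
When several transcription jobs have finished, the same call can be applied to each result file. The following is a minimal sketch, not part of tscribe: the helper name write_all is hypothetical, and the writer is passed in as an argument so the loop relies only on the tscribe.write(path) call shown above.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def write_all(json_paths: Iterable[str], write: Callable[[str], object]) -> List[str]:
    """Call `write` once for every path that exists; return the paths processed.

    `write` is expected to behave like tscribe.write(path). Paths that do not
    point at a file are skipped rather than raising.
    """
    processed = []
    for path in json_paths:
        if Path(path).is_file():
            write(path)
            processed.append(path)
    return processed


# Usage (requires tscribe to be installed):
#   import tscribe
#   write_all(["interview1.json", "interview2.json"], tscribe.write)
```

Passing the writer in also makes it easy to try the loop with a stand-in function before running it against real files.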

You can also use tscribe from the command line.

tscribe output.json

See the help message for more details.

tscribe -h

Output formats

Supported output formats include:

  • docx (default)
  • csv
  • sqlite
  • vtt

import tscribe

tscribe.write("output.json", format="docx")
tscribe.write("output.json", format="csv")
tscribe.write("output.json", format="sqlite")
tscribe.write("output.json", format="vtt")

output.docx written in x seconds.
output.csv written in x seconds.
output.db written in x seconds.
output.vtt written in x seconds.
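
Note that the sqlite format produces a .db file, not a .sqlite file. If a script needs to know the default output name in advance, the mapping shown in the messages above can be sketched as follows. The helper expected_output is hypothetical and not part of tscribe, and it assumes the default name is the input name with its extension swapped, which is what the messages suggest but do not guarantee.

```python
from pathlib import Path

# Extensions as they appear in the "written in x seconds" messages above.
EXTENSIONS = {"docx": ".docx", "csv": ".csv", "sqlite": ".db", "vtt": ".vtt"}


def expected_output(json_path: str, fmt: str = "docx") -> Path:
    """Guess the default output name for a given input file and format."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Assumption: only the extension changes, e.g. output.json -> output.db
    return Path(json_path).with_suffix(EXTENSIONS[fmt])


# Usage:
#   expected_output("output.json", "sqlite")
```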

Target directory or filename

To control the output filename or the directory it is written to, pass save_as. Typical use cases are following a naming convention or running in a serverless environment, where often only a temporary directory such as /tmp is writable.

import tscribe

tscribe.write("output.json", save_as="transcript.docx")
tscribe.write("output.json", save_as="/tmp/transcript.docx")

transcript.docx written in x seconds.
/tmp/transcript.docx written in x seconds.
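
A naming convention can be applied by building the save_as value before calling tscribe.write. The sketch below is one possible convention, not something tscribe provides: the helper dated_save_as and its parameters are hypothetical, and it defaults to the system temporary directory on the assumption that this is the writable location in a serverless runtime.

```python
import tempfile
from datetime import date
from pathlib import Path


def dated_save_as(json_path, directory=None, on=None, suffix=".docx"):
    """Build a path such as /tmp/2024-01-02_output.docx for use as save_as."""
    # Fall back to the system temp directory when no directory is given.
    target_dir = Path(directory) if directory else Path(tempfile.gettempdir())
    day = on or date.today()
    stem = Path(json_path).stem  # "output.json" -> "output"
    return str(target_dir / f"{day.isoformat()}_{stem}{suffix}")


# Usage (requires tscribe to be installed):
#   import tscribe
#   tscribe.write("output.json", save_as=dated_save_as("output.json"))
```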

Combining format and target

import tscribe

tscribe.write("output.json", format="csv", save_as="output/output.csv")

output/output.csv written in x seconds.
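
The example above writes into an output/ subdirectory. This README does not say whether tscribe creates a missing directory, so a cautious script can create it first. The helper ensure_parent below is hypothetical and uses only the standard library.

```python
from pathlib import Path


def ensure_parent(save_as: str) -> str:
    """Create the parent directory of `save_as` if needed and return `save_as`."""
    # exist_ok=True makes this safe to call when the directory already exists.
    Path(save_as).parent.mkdir(parents=True, exist_ok=True)
    return save_as


# Usage (requires tscribe to be installed):
#   import tscribe
#   tscribe.write("output.json", format="csv",
#                 save_as=ensure_parent("output/output.csv"))
```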