tscribe
Produce Word document, CSV, SQLite and VTT transcripts from the automatic speech recognition output of AWS Transcribe.
Installation
pip install git+https://github.com/clane9/aws_transcribe_to_docx
Results
Time | Speaker | Comment |
---|---|---|
0:00:03 | spk_0 | Is this on? |
0:00:05 | spk_1 | Yep. |
0:00:06 | spk_0 | Great. |
Usage
Getting started
Import tscribe and point tscribe.write(...) at the .json file produced by AWS Transcribe.
import tscribe
tscribe.write("output.json")
output.docx written in x seconds.
You can also use tscribe from the command line.
tscribe output.json
See the help message for more details.
tscribe -h
Output formats
Supported output formats include:
docx (default)
csv
sqlite
vtt
import tscribe
tscribe.write("output.json", format="docx")
tscribe.write("output.json", format="csv")
tscribe.write("output.json", format="sqlite")
tscribe.write("output.json", format="vtt")
output.docx written in x seconds.
output.csv written in x seconds.
output.db written in x seconds.
output.vtt written in x seconds.
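The naming pattern in the sample output above (note that sqlite produces a .db file) can be sketched as a small helper. This is only an illustration: expected_output_name and the EXTENSIONS mapping are hypothetical names, inferred from the messages shown here rather than taken from tscribe itself, and the helper returns just the filename, not the directory tscribe writes to.

```python
from pathlib import Path

# Assumed mapping, inferred from the sample output above (sqlite -> .db).
# This is not read from tscribe's source.
EXTENSIONS = {"docx": ".docx", "csv": ".csv", "sqlite": ".db", "vtt": ".vtt"}


def expected_output_name(json_path, fmt="docx"):
    """Return the filename we expect for a given input file and format."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Swap the .json suffix for the format's extension and keep only the name.
    return Path(json_path).with_suffix(EXTENSIONS[fmt]).name


print(expected_output_name("output.json", "sqlite"))  # output.db
```

A helper like this may be handy when a later step needs to find the file that was just written.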
Target directory or filename
Use save_as to set the output filename or directory explicitly, for example to follow a naming convention or to write to a writable location such as /tmp in a serverless environment.
import tscribe
tscribe.write("output.json", save_as="transcript.docx")
tscribe.write("output.json", save_as="/tmp/transcript.docx")
transcript.docx written in x seconds.
/tmp/transcript.docx written in x seconds.
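For the serverless case, one possible approach is to build the save_as value from the system's temporary directory instead of hard-coding /tmp. The helper below is a minimal sketch; scratch_target is a hypothetical name and not part of tscribe.

```python
import tempfile
from pathlib import Path


def scratch_target(json_path, suffix=".docx", directory=None):
    """Build an output path inside a scratch directory (hypothetical helper)."""
    # Fall back to the platform's temp directory, e.g. /tmp on Linux.
    base = Path(directory) if directory else Path(tempfile.gettempdir())
    # Reuse the input file's stem so the transcript is named after its source.
    return str(base / (Path(json_path).stem + suffix))


# Intended use, based on the save_as examples above:
#   tscribe.write("output.json", save_as=scratch_target("output.json"))
```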
Combining format and target
import tscribe
tscribe.write("output.json", format="csv", save_as="output/output.csv")
output/output.csv written in x seconds.
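This README does not say whether tscribe creates a missing target directory such as output/. As a precaution, the directory can be created before writing. The sketch below assumes nothing about tscribe beyond the save_as parameter shown above; ensure_parent_dir is a hypothetical helper.

```python
from pathlib import Path


def ensure_parent_dir(target):
    """Create the parent directory of target if needed and return target."""
    # parents=True creates intermediate directories; exist_ok=True makes
    # the call safe to repeat when the directory is already there.
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    return str(target)


# Intended use, based on the example above:
#   tscribe.write("output.json", format="csv",
#                 save_as=ensure_parent_dir("output/output.csv"))
```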