pyvttt is a simple Video-to-Text Transcriber written in Python.
This tool uses Poetry to manage dependencies and packaging. To install all the dependencies, simply run:
poetry install
pyvttt uses the GPU by default if one is available. No additional dependencies are needed for Nvidia GPUs.
If you have an AMD GPU, you can install the ROCm dependencies by uncommenting the torch and torchvision specific lines in the pyproject.toml file and running the following command:
poetry lock && poetry install
Don't forget to set the ROCm environment variable based on your system configuration. Example for ROCm 6.0.0 and an AMD 6700XT GPU:
export HSA_OVERRIDE_GFX_VERSION=10.3.0
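To check whether the GPU is actually visible after installation, a minimal sketch along these lines may help. It assumes PyTorch is the backend (as the torch lines in pyproject.toml suggest) and relies on the fact that ROCm builds of PyTorch report AMD GPUs through the torch.cuda API. The helper name detect_device is hypothetical and not part of pyvttt; the override value is the 6700XT example from above and should be adjusted for other cards.

```python
import importlib.util
import os

# Assumption: the override has to be in the environment before torch
# initializes the ROCm runtime, so it is set before torch is imported.
# "10.3.0" is the 6700XT example value; adjust it for your GPU.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")


def detect_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # torch is not installed in this environment
    import torch

    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(detect_device())
```

Run it inside the project environment, for example with poetry run python, so that it sees the same torch build as pyvttt.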
You can run the tool using poetry:
poetry run pyvttt --help
Or you can run the tool using python:
python -m pyvttt --help
Or you can run the tool directly from its directory, or add it to your PATH:
pyvttt --help
usage: pyvttt [-h] [--verbose] [--debug] [--quiet | --no-quiet | -q] [--version] [--url URL [URL ...]] [--file FILE] [--output OUTPUT] [--stdout | --no-stdout | -s] [--threads THREADS] [--cpu | --no-cpu | -c] [--force-download | --no-force-download | -d] [--translate TRANSLATE]
              [--summarize SUMMARIZE] [--audio AUDIO [AUDIO ...]]

pyvttt is a simple Video-to-Text Transcriber written in Python.

options:
  -h, --help            show this help message and exit
  --verbose, -v         Increase verbosity. Use more than once to increase verbosity level (e.g. -vvv).
  --debug               Enable debug mode.
  --quiet, --no-quiet, -q
                        Do not print any output/log.
  --version             Show version and exit.
  --url URL [URL ...], -u URL [URL ...]
                        URL(s) of the video to download and transcribe.
  --file FILE, -f FILE  Path to file with urls to download and transcribe.
  --output OUTPUT, -o OUTPUT
                        Path to save the transcription.
  --stdout, --no-stdout, -s
  --threads THREADS, -t THREADS
                        Number of threads to use. Default is half of the available cores.
  --cpu, --no-cpu, -c   Force to use CPU instead of GPU.
  --force-download, --no-force-download, -d
                        Force to download the video even if it is already downloaded.
  --translate TRANSLATE, -l TRANSLATE
                        Translate transcription to the specified language. Default is english.
  --summarize SUMMARIZE, -m SUMMARIZE
                        Summarize transcription, you can define a summarization strength between 0 and 100. Suggested value: 90.
  --audio AUDIO [AUDIO ...], -a AUDIO [AUDIO ...]
                        Audio file(s) to process. Supported formats: m4a, mp3, webm, mp4, mpga, wav and mpeg.
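The paired flags in the help text (for example --cpu and --no-cpu) match what Python's argparse.BooleanOptionalAction generates. The following is only a sketch of how a few of these options could be declared, not pyvttt's actual source. The strength validator and the build_parser function are hypothetical, and whether pyvttt rejects values outside 0 to 100 is an assumption.

```python
import argparse


def strength(value: str) -> int:
    """Parse a summarization strength and require 0 <= value <= 100."""
    number = int(value)
    if not 0 <= number <= 100:
        raise argparse.ArgumentTypeError("strength must be between 0 and 100")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced, illustrative version of the command line interface."""
    parser = argparse.ArgumentParser(prog="pyvttt")
    # nargs="+" accepts one or more URLs after a single --url flag.
    parser.add_argument("--url", "-u", nargs="+", help="URL(s) of the video.")
    # BooleanOptionalAction (Python 3.9+) adds the --no-cpu variant automatically.
    parser.add_argument("--cpu", "-c", action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument("--summarize", "-m", type=strength)
    return parser


args = build_parser().parse_args(["-u", "first_url", "second_url", "--no-cpu", "-m", "90"])
print(args.url, args.cpu, args.summarize)
```

With this declaration, an out-of-range value such as -m 101 makes argparse print an error and exit instead of starting a transcription.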
# note the double quotes around the url
pyvttt --url "youtube_url" --output transcription.txt
# note the double quotes around the urls
pyvttt --url "youtube_url" "another_youtube_url" --output transcriptions
pyvttt --audio "path/to/audio/file" --output transcription.txt
pyvttt --audio "path/to/audio/file" "path/to/another/audio/file" --output transcriptions
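The quotes in the commands above matter because video URLs often contain characters such as & and ?, which the shell would otherwise interpret. When calling pyvttt from a script, one way to avoid quoting mistakes is to build the argument list in Python and let shlex handle the quoting. This is a sketch only: build_command is a hypothetical helper and the URL is a placeholder.

```python
import shlex


def build_command(urls: list[str], output: str) -> list[str]:
    """Assemble a pyvttt argument list for one or more URLs."""
    return ["pyvttt", "--url", *urls, "--output", output]


# Placeholder URL containing shell metacharacters (? and &).
cmd = build_command(["https://example.com/watch?v=abc&t=10s"], "transcription.txt")

# shlex.join quotes every argument that needs it, so the result can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed straight to subprocess.run(cmd, check=True), in which case no shell is involved and no quoting is needed.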
You can choose to translate the audio transcription using the --translate option:
pyvttt --url "youtube_url" --output transcription.txt --translate it
# or
pyvttt --url "youtube_url" --output transcription.txt --translate italian
Supported languages:
- arabic (ar)
- czech (cs)
- german (de)
- english (en)
- spanish (es)
- estonian (et)
- finnish (fi)
- french (fr)
- gujarati (gu)
- hindi (hi)
- italian (it)
- japanese (ja)
- kazakh (kk)
- korean (ko)
- lithuanian (lt)
- latvian (lv)
- burmese (my)
- nepali (ne)
- dutch (nl)
- romanian (ro)
- russian (ru)
- sinhala (si)
- turkish (tr)
- vietnamese (vi)
- chinese (zh)
- afrikaans (af)
- azerbaijani (az)
- bengali (bn)
- persian (fa)
- hebrew (he)
- croatian (hr)
- indonesian (id)
- georgian (ka)
- khmer (km)
- macedonian (mk)
- malayalam (ml)
- mongolian (mn)
- marathi (mr)
- polish (pl)
- pashto (ps)
- portuguese (pt)
- swedish (sv)
- swahili (sw)
- tamil (ta)
- telugu (te)
- thai (th)
- tagalog (tl)
- ukrainian (uk)
- urdu (ur)
- xhosa (xh)
- galician (gl)
- slovene (sl)
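Since --translate accepts either a language name or its two-letter code, the lookup can be pictured as a small normalization step. The sketch below uses a hypothetical resolve_language helper and only a handful of entries from the list above; it does not reflect how pyvttt resolves languages internally.

```python
# Illustrative subset of the language list above (name -> code).
LANGUAGES = {
    "english": "en",
    "italian": "it",
    "german": "de",
    "french": "fr",
    "spanish": "es",
}


def resolve_language(value: str) -> str:
    """Return the two-letter code for a language given by name or by code."""
    key = value.strip().lower()
    if key in LANGUAGES:
        return LANGUAGES[key]
    if key in LANGUAGES.values():
        return key
    raise ValueError(f"unsupported language: {value!r}")


print(resolve_language("italian"), resolve_language("it"))
```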
You can choose to summarize the transcription using the --summarize option and defining the summarization strength (a value between 0 and 100):
pyvttt --url "youtube_url" --output transcription.txt --summarize 90
You can run pyvttt on multiple videos by using a file with URLs, one per line:
first_video_url_here
second_video_url_here
# this is a commented line that will be ignored
third_video_url_here
pyvttt --file urls.txt --output transcriptions
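The file format shown above (one URL per line, with lines starting with # ignored) can be read with a few lines of Python, which is handy for checking a URL list before a long run. The read_urls helper is hypothetical, and skipping blank lines is an assumption rather than documented pyvttt behaviour.

```python
import tempfile
from pathlib import Path


def read_urls(path) -> list[str]:
    """Return the URLs in a file, skipping comment lines and blank lines."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Lines starting with "#" are comments; blank lines are skipped (assumption).
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls


# Small demonstration with a temporary urls.txt.
with tempfile.TemporaryDirectory() as tmp:
    url_file = Path(tmp) / "urls.txt"
    url_file.write_text(
        "first_video_url_here\n# ignored comment\n\nsecond_video_url_here\n",
        encoding="utf-8",
    )
    urls = read_urls(url_file)

print(urls)
```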
To set the number of threads, use the --threads option:
pyvttt [...] --threads 16
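According to the help text, the default thread count is half of the available cores. A rough way to see what that means on a given machine is sketched below. The default_threads helper is hypothetical, and the exact rounding pyvttt uses is an assumption.

```python
import os


def default_threads() -> int:
    """Return half of the CPU cores reported by the OS, but at least 1."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, cores // 2)


print(default_threads())
```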
To run the tests simply run:
poetry run test
To update the setup.py file with the latest dependencies and versions, run:
poetry run poetry2setup > setup.py
This project was built using powerful tools, libraries and pretrained models such as poetry, pydantic, pytest, openai-whisper, pytube, huggingface, facebook/mbart, facebook/bart-cnn and more. I simply put the pieces together. Please check out and support all the tools and libraries used in this project.