Convert slides and script into a video with TTS voiceover.
This is a set of bash scripts to help convert a set of slides into a video with a voiceover.
The voiceover is generated using coqui-ai/TTS.
The scripts are designed to be run on a Linux machine with Docker, an NVIDIA GPU, and ffmpeg installed.
- Clone this repository.
- Add a file called script.txt and add slides.pdf. script.txt should contain the script for the voiceover, one line per slide. slides.pdf should contain the slides.
- Run ./extract_slides.sh to extract the slides into individual images using pdftoppm.
- Run ./generate_voiceover.sh to generate the voiceover.
- You can now use any video editing software like DaVinci Resolve to combine your rasterized slides and voiceovers.
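Because script.txt must have exactly one line per slide, it can help to compare the two counts before generating any audio. The following is a minimal sketch and not part of this repository: the helper names count_lines, count_pages, and check_counts are hypothetical, and it assumes pdfinfo (shipped with poppler-utils, alongside pdftoppm) is installed.

```shell
#!/usr/bin/env bash
# Sketch: check that script.txt has one line per page of slides.pdf.
# Assumption: pdfinfo (poppler-utils) is available; helper names are hypothetical.

# Count lines in a text file, including a final line without a trailing newline.
count_lines() {
  grep -c '' "$1"
}

# Read the page count from the "Pages:" row of pdfinfo's output.
count_pages() {
  pdfinfo "$1" | awk '/^Pages:/ {print $2}'
}

# Compare the two counts; return non-zero on a mismatch.
check_counts() {
  local lines="$1" pages="$2"
  if [ "$lines" -eq "$pages" ]; then
    echo "OK: $lines script lines for $pages slides"
  else
    echo "Mismatch: $lines script lines but $pages slides" >&2
    return 1
  fi
}

# Usage (uncomment to run against real files):
# check_counts "$(count_lines script.txt)" "$(count_pages slides.pdf)"
```

A mismatch usually means a blank line or a wrapped sentence in script.txt.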
Utility scripts:
- Run ./initialize_durations to initialize the durations of each slide.
  - You can tune durations.txt to your liking, e.g. for overlay videos.
- Run ./stitch_video.sh to combine the slides and voiceover into a video.
- For stitch_video.sh, you can optionally provide an overlays.txt file to add video overlays to the video. This should be a text file where each line has the format:
filename;width:height;position_x:position_y;start_overlay_time;end_overlay_time;start_time
For example, to put overlay.mp4 in the top-left corner from time 0s to 10s:
overlay.mp4;200:-2:0:0;0:10:0
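For readers curious what the stitching step involves at its simplest, one slide image and its voiceover clip can be combined with ffmpeg roughly as follows. This is a sketch under assumptions, not the contents of stitch_video.sh: the function name make_segment and all file names are hypothetical, and it ignores durations.txt and overlays.txt entirely.

```shell
#!/usr/bin/env bash
# Sketch: turn one still image plus one audio clip into a video segment.
# Assumption: this is NOT the logic of stitch_video.sh; names are hypothetical.

make_segment() {
  local image="$1" audio="$2" output="$3"
  # -loop 1 repeats the still image as video frames;
  # -shortest stops encoding when the audio stream ends.
  ffmpeg -y -loop 1 -i "$image" -i "$audio" \
    -c:v libx264 -tune stillimage -pix_fmt yuv420p \
    -c:a aac -shortest "$output"
}

# Usage (hypothetical file names):
# make_segment slide-01.png voiceover-01.wav segment-01.mp4
```

Segments produced this way would still need to be concatenated into one video, which is the part the stitching script handles for you.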