slides-voiceover-generator

Convert slides and script into a video with TTS voiceover.
This is a set of bash scripts that help convert a set of slides into a video with a voiceover.
The voiceover is generated using coqui-ai/TTS.
The scripts are designed to run on a Linux machine with Docker, an NVIDIA GPU, and ffmpeg installed.
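
As a rough sketch, the requirements above can be checked before running anything. The helper name missing_tools and the exact tool list (including nvidia-smi as a stand-in for "an NVIDIA GPU is usable") are assumptions made for illustration, not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every given tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list, based on the requirements described above.
missing="$(missing_tools docker ffmpeg nvidia-smi pdftoppm)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so each name gets its own line.
  printf 'Missing required tool: %s\n' $missing >&2
fi
```

The check only confirms that the commands exist; it does not verify that Docker can actually reach the GPU.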

Usage

1. Clone this repository.
2. Add a file called script.txt and a file called slides.pdf.
   - script.txt should contain the script for the voiceover, one line per slide.
   - slides.pdf should contain the slides.
3. Run ./extract_slides.sh to extract the slides into individual images using pdftoppm.
4. Run ./generate_voiceover.sh to generate the voiceover.
5. You can now use any video editing software, such as DaVinci Resolve, to combine your rasterized slides and voiceovers.
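
Because script.txt is expected to hold one line per slide, it can help to compare its line count with the page count of slides.pdf before running the steps above. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes pdfinfo (shipped with poppler-utils alongside pdftoppm) is installed.

```shell
#!/usr/bin/env bash
# Count lines in a file, including a final line that lacks a trailing newline.
count_lines() {
  grep -c '' "$1"
}

# Read the page count from pdfinfo's "Pages:" line.
count_pages() {
  pdfinfo "$1" | awk '/^Pages:/ {print $2}'
}

# Succeed only when the two counts are equal.
counts_match() {
  [ "$1" -eq "$2" ]
}

if [ -f script.txt ] && [ -f slides.pdf ]; then
  lines="$(count_lines script.txt)"
  pages="$(count_pages slides.pdf)"
  if counts_match "$lines" "$pages"; then
    echo "OK: $lines script lines for $pages slides"
  else
    echo "Mismatch: $lines script lines but $pages slides" >&2
  fi
fi
```

Blank lines in script.txt are counted as lines here, so a stray empty line at the end of the file will show up as a mismatch.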

Utility scripts:

Run ./initialize_durations to initialize the durations of each slide.
- You can tune durations.txt to your liking, e.g. for overlay videos.
Run ./stitch_video.sh to combine the slides and voiceover into a video.

Notes

For stitch_video.sh, you can optionally provide an overlays.txt file to add video overlays to the video.
This should be a text file where each line has the format:
filename;width:height;position_x:position_y;start_overlay_time;end_overlay_time;start_time
For example, to put overlay.mp4 in the top-left corner from time 0s to 10s:
overlay.mp4;200:-2:0:0;0:10:0
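
To illustrate how such a line breaks down into its eight values, here is a small sketch that reads overlay lines and prints each field by name. It is not how stitch_video.sh necessarily parses the file: the function name describe_overlays is hypothetical, and the sketch treats both `;` and `:` as separators (so it accepts either grouping of the fields) and assumes file names contain neither character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read overlay lines from stdin and print their fields.
describe_overlays() {
  local file width height x y start end offset
  # Split on ';' and ':' alike; also handle a last line with no newline.
  while IFS=';:' read -r file width height x y start end offset || [ -n "$file" ]; do
    [ -n "$file" ] || continue  # skip blank lines
    printf '%s size=%s:%s pos=%s:%s overlay=%s..%s start_time=%s\n' \
      "$file" "$width" "$height" "$x" "$y" "$start" "$end" "$offset"
  done
}

if [ -f overlays.txt ]; then
  describe_overlays < overlays.txt
fi
```

For the example line above, this prints `overlay.mp4 size=200:-2 pos=0:0 overlay=0..10 start_time=0`.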
