This demo shows how I create custom human avatar videos from text. It takes three inputs: an initial video, several voice samples (yours or someone else's), and the text for the avatar to speak. The output is a video featuring the designated face, with voice narration synchronized to the lip movements.
See the Colab notebook for details: link
The pipeline builds on three tools:
- SimSwap for face swapping
- Tortoise-tts for text-to-speech
- Wav2Lip for lip sync and generation
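
To make the data flow between the three tools concrete, here is a minimal Python sketch of how the stages might be chained. It runs none of the tools; it only records which artifacts each stage would consume and produce. The names `Stage`, `build_plan`, `check_plan`, and the intermediate file names are hypothetical. The stage order (face swap before lip sync) and the separate face image passed to SimSwap are also assumptions, so check the notebook for the actual order and inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the pipeline: a tool, what it reads, what it writes."""
    tool: str
    inputs: tuple
    output: str


def build_plan(video, face_image, voice_samples, text):
    """Return the ordered stages. Nothing is executed here.

    Assumed order: swap the face, synthesize the speech,
    then lip-sync the swapped video to the synthesized audio.
    """
    return [
        Stage("SimSwap", (video, face_image), "swapped.mp4"),
        Stage("Tortoise-tts", (text, *voice_samples), "speech.wav"),
        Stage("Wav2Lip", ("swapped.mp4", "speech.wav"), "final.mp4"),
    ]


def check_plan(plan, available):
    """Check that every stage's inputs exist by the time it would run.

    `available` lists the artifacts present before the first stage.
    Returns the set of artifacts present after the last stage.
    """
    have = set(available)
    for stage in plan:
        missing = [item for item in stage.inputs if item not in have]
        if missing:
            raise ValueError(f"{stage.tool} is missing inputs: {missing}")
        have.add(stage.output)
    return have
```

A plan built with `build_plan("input.mp4", "face.jpg", ["voice1.wav", "voice2.wav"], "Hello")` can be passed to `check_plan` together with those same inputs; it raises `ValueError` if, for example, a voice sample is left out.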