This notebook combines several state-of-the-art image, text, and speech generation neural networks in a single Google Colab notebook. Given a text question as input, it generates a talking-head video of a random, nonexistent person speaking a reply.
- Face Generation - www.thispersondoesnotexist.com - StyleGAN2
- Text Generation - www.textsynth.org - OpenAI GPT-2
- Text-to-Speech Conversion - https://github.com/NVIDIA/flowtron - Flowtron
- Lip Animation - https://github.com/Rudrabha/LipGAN - LipGAN
- Use a motion model to animate the face before performing lip-sync.
- Use the newer GPT-3 model for better, more coherent text responses.
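The four stages listed above (face, text, speech, lip animation) chain together roughly as in the sketch below. This is only an illustration of the data flow, not the notebook's actual code: the class `TalkingHeadPipeline`, its field names, and the stub functions are all hypothetical, and none of them correspond to real APIs of StyleGAN2, GPT-2, Flowtron, or LipGAN.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TalkingHeadPipeline:
    # Each field is a hypothetical wrapper around one stage of the notebook.
    generate_face: Callable[[], bytes]              # e.g. StyleGAN2: returns a face image
    generate_reply: Callable[[str], str]            # e.g. GPT-2: question -> reply text
    synthesize_speech: Callable[[str], bytes]       # e.g. Flowtron: reply text -> audio
    animate_lips: Callable[[bytes, bytes], bytes]   # e.g. LipGAN: (face, audio) -> video

    def run(self, question: str) -> bytes:
        # The face does not depend on the question; the reply drives the audio,
        # and the audio drives the lip animation.
        face = self.generate_face()
        reply = self.generate_reply(question)
        audio = self.synthesize_speech(reply)
        return self.animate_lips(face, audio)


# Stub stages so the data flow can be exercised without any model installed.
demo = TalkingHeadPipeline(
    generate_face=lambda: b"face",
    generate_reply=lambda q: "reply to: " + q,
    synthesize_speech=lambda text: b"audio(" + text.encode() + b")",
    animate_lips=lambda face, audio: b"video(" + face + b"+" + audio + b")",
)
video = demo.run("How are you?")
```

Keeping each stage behind a plain callable means a single stage could be swapped (for example, a different text model) without touching the others, under the assumption that its inputs and outputs stay the same.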