Deepfake-using-Wave2Lip

A deep learning model to lip-sync a given video with any given audio. It uses a GAN-style architecture trained with a reconstruction loss and a pre-trained lip-sync expert discriminator.

Primary language: Jupyter Notebook · License: MIT

Wav2Lip: Accurately Lip-sync Videos In Any Language.

This repository accompanies the paper *A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild*, published at ACM Multimedia 2020.

🧾 Official Paper 📑 Project Page 🔑 Original Repo
💡 Colab Notebook

Note: This project is entirely based on the original Wav2Lip work by Rudrabha.

🧠 Video Output:

👉 Trump speaking in Telugu (an Indian language):

🗺 Architecture:

This approach generates accurate lip-sync by learning from an already well-trained lip-sync expert. Unlike previous works that employ only a reconstruction loss or train a discriminator in a GAN setup, we use a pre-trained discriminator that is already quite accurate at detecting lip-sync errors. We show that fine-tuning it further on the noisy generated faces hampers the discriminator's ability to measure lip-sync, thus also affecting the generated lip shapes.
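For intuition, here is a minimal, hypothetical PyTorch-style sketch of such an objective: an L1 reconstruction loss plus a sync loss from a pre-trained expert whose weights stay frozen. Identifiers such as `generator`, `sync_expert`, and `sync_weight` are illustrative, not the repository's exact names.

```python
import torch
import torch.nn.functional as F

def lip_sync_loss(generator, sync_expert, face_frames, mels, target_frames,
                  sync_weight=0.03):
    """Reconstruction loss + sync loss from a frozen, pre-trained expert (sketch)."""
    # Freeze the expert: gradients still flow *through* it to the generator,
    # but its own weights are never fine-tuned on the noisy generated faces.
    for p in sync_expert.parameters():
        p.requires_grad_(False)

    pred_frames = generator(face_frames, mels)            # generated lip region
    recon_loss = F.l1_loss(pred_frames, target_frames)    # pixel-level reconstruction

    # The expert embeds audio and video and scores how well they line up.
    audio_emb, video_emb = sync_expert(mels, pred_frames)
    sync_prob = F.cosine_similarity(audio_emb, video_emb).clamp(min=1e-7)
    sync_loss = -torch.log(sync_prob).mean()              # penalize off-sync output

    return recon_loss + sync_weight * sync_loss
```

Because the expert is frozen, the generator is pushed toward accurate lip shapes without the expert itself degrading on noisy generated frames.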

🔧 Try it yourself:

  • A base video that needs to be lip-synced.
  • An audio file in any language to mimic.
  • That's all you need; a sketch of the inference call is shown below.
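As a rough sketch, inference can be run through the original repository's inference.py. The flag names below follow Rudrabha's Wav2Lip script; the checkpoint and file paths are placeholders.

```python
import subprocess

def lip_sync(face_video, audio_file,
             checkpoint="checkpoints/wav2lip_gan.pth",
             outfile="results/result_voice.mp4"):
    """Call the Wav2Lip inference script on a base video and target audio (sketch)."""
    cmd = [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,   # pre-trained Wav2Lip weights
        "--face", face_video,              # base video to be lip-synced
        "--audio", audio_file,             # target speech in any language
        "--outfile", outfile,              # where the synced video is written
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    lip_sync("input/speech_video.mp4", "input/telugu_speech.wav")
```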

⚡ Highlights:

  • Lip-sync videos to any target speech with high accuracy 💯
  • The audio source can be any file supported by FFMPEG that contains audio data: *.wav, *.mp3, or even a video file, from which the code will automatically extract the audio (an illustrative extraction command is sketched below).
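For illustration only, the kind of FFMPEG call used to pull a .wav track out of a video looks like the sketch below; the repository runs a similar extraction automatically, and its exact command may differ.

```python
import subprocess

def extract_audio(video_path, wav_path="temp/audio.wav"):
    """Pull the audio track out of a video as 16 kHz mono WAV (illustrative)."""
    subprocess.run([
        "ffmpeg", "-y",
        "-i", video_path,           # input video containing the speech
        "-vn",                      # drop the video stream
        "-acodec", "pcm_s16le",     # uncompressed 16-bit PCM
        "-ar", "16000",             # 16 kHz sample rate, common for speech models
        "-ac", "1",                 # mono
        wav_path,
    ], check=True)
    return wav_path
```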

⚠ Creator Disclaimer

All results from this open-source code or our demo website should be used for research, academic, or personal purposes only. Since the models are trained on the LRS2 dataset, any form of commercial use is strictly prohibited. Please contact us for any further queries.