gauravk95
👨‍💻 Android and Backend Dev, experienced in AR, 3D Graphics, ML, Camera, Audio, Video, OTT, and Social Media apps. 📱 Building apps for billions
Bangalore, India
gauravk95's Stars
supabase/supabase
The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
InstantID/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
PeterL1n/RobustVideoMatting
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
PaddlePaddle/PaddleGAN
PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on.
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
facebookresearch/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
android/play-billing-samples
Samples for Google Play In-app Billing
Zz-ww/SadTalker-Video-Lip-Sync
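This project implements Wav2Lip-style video lip synthesis based on SadTalker. Lip shapes are generated from speech, with a video file as the input, and a configurable enhancement is applied to the synthesized lip (face) region to improve the clarity of the generated lips. The DAIN deep learning frame-interpolation algorithm is used to add frames to the generated video, filling in the transitions between synthesized lip movements so that the lips look smoother, more realistic, and more natural.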
FACEGOOD/FACEGOOD-Audio2Face
http://www.facegood.cc
Nutlope/notesGPT
Record voice notes & transcribe, summarize, and get tasks
xiaobai1217/Awesome-Video-Datasets
Video datasets
YuanxunLu/LiveSpeechPortraits
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
SociallyIneptWeeb/AICoverGen
A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
wladradchenko/wunjo.wladradchenko.ru
Wunjo CE: Face Swap, Lip Sync, Control Remove Objects & Text & Background, Restyling, Audio Separator, Clone Voice, Video Generation. Open Source, Local & Free.
Doubiiu/CodeTalker
[CVPR 2023] CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
RenYurui/PIRender
The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"
choyingw/SynergyNet
3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
amirbar/speech2gesture
code for training the models from the paper "Learning Individual Styles of Conversational Gestures"
yfeng95/DELTA
Learning Disentangled Avatars with Hybrid 3D Representations. (Face, Body, Hair and Clothing)
KangweiiLiu/Awesome_Audio-driven_Talking-Face-Generation
A curated list of resources of audio-driven talking face generation
zhongshaoyy/Audio2Face
gauravk95/SadTalker-Video
This project is based on SadTalker to implement video lip synthesis.