Shedima's Stars
NVlabs/edm
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
simonalexanderson/ListenDenoiseAction
Code to reproduce the results for our SIGGRAPH 2023 paper "Listen, Denoise, Action!"
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
scutcsq/DWFormer
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)
facebookresearch/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
mingyuan-zhang/MotionDiffuse
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
DiffPoseTalk/DiffPoseTalk
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
sstzal/DiffTalk
[CVPR 2023] Implementation of "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
junyanz/CycleGAN
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.
junyanz/pytorch-CycleGAN-and-pix2pix
Image-to-Image Translation in PyTorch
SoonminHwang/rgbt-ped-detection
KAIST Multispectral Pedestrian Detection Benchmark [CVPR '15]
simonalexanderson/StyleGestures
ubisoft/ubisoft-laforge-ZeroEGGS
All about ZeroEGGS
extreme-assistant/CVPR2024-Paper-Code-Interpretation
A collection of CVPR 2017–2024 papers, code, interpretations, and livestreams, curated by the Jishi team
EvelynFan/FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
youngwoo-yoon/Co-Speech_Gesture_Generation
An implementation of "Robots Learn Social Skills: End-to-End Learning of Co-Speech Gesture Generation for Humanoid Robots".
google-research/google-research
Google Research
limaosen0/DMGNN
The implementation of DMGNN
ShenhanQian/SpeechDrivesTemplates
[ICCV 2021] The official repo for the paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
guillefix/transflower-lightning
multimodal transformer
mli/paper-reading
Paragraph-by-paragraph close readings of classic and recent deep learning papers
NVIDIA/vid2vid
PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.
TheTempAccount/Co-Speech-Motion-Generation
Freeform Body Motion Generation from Speech
uniBruce/Mead
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV 2020]
jixinya/EVP
Code for paper 'Audio-Driven Emotional Video Portraits'.
thuiar/Cross-Modal-BERT
CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis (MM 2020)
labmlai/annotated_deep_learning_paper_implementations
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
alvinliu0/HA2G
[CVPR 2022] Code for "Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation"