Shedima's Stars
NVlabs/edm
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
simonalexanderson/ListenDenoiseAction
Code to reproduce the results for our SIGGRAPH 2023 paper "Listen, Denoise, Action!"
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
scutcsq/DWFormer
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)
facebookresearch/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
mingyuan-zhang/MotionDiffuse
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
DiffPoseTalk/DiffPoseTalk
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
sstzal/DiffTalk
[CVPR 2023] Implementation of "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
junyanz/CycleGAN
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.
junyanz/pytorch-CycleGAN-and-pix2pix
Image-to-Image Translation in PyTorch
SoonminHwang/rgbt-ped-detection
KAIST Multispectral Pedestrian Detection Benchmark [CVPR '15]
simonalexanderson/StyleGestures
ubisoft/ubisoft-laforge-ZeroEGGS
All about ZeroEGGS
extreme-assistant/CVPR2024-Paper-Code-Interpretation
A collection of CVPR 2017–2024 papers, code, interpretations, and livestreams, curated by the Jishi team
EvelynFan/FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
youngwoo-yoon/Co-Speech_Gesture_Generation
An implementation of "Robots Learn Social Skills: End-to-End Learning of Co-Speech Gesture Generation for Humanoid Robots".
google-research/google-research
Google Research
limaosen0/DMGNN
The implementation of DMGNN
ShenhanQian/SpeechDrivesTemplates
[ICCV 2021] The official repo for the paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
guillefix/transflower-lightning
multimodal transformer
mli/paper-reading
Paragraph-by-paragraph close readings of classic and recent deep learning papers
NVIDIA/vid2vid
PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.
TheTempAccount/Co-Speech-Motion-Generation
Freeform Body Motion Generation from Speech
uniBruce/Mead
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV 2020]
jixinya/EVP
Code for paper 'Audio-Driven Emotional Video Portraits'.
thuiar/Cross-Modal-BERT
CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis (MM 2020)
labmlai/annotated_deep_learning_paper_implementations
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
alvinliu0/HA2G
[CVPR 2022] Code for "Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation"