Cb1ock's Stars
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
YuanGongND/cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
MSA-LMC/S2D
[TAFFC 2024] The official implementation of the paper "From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos"
deepinsight/insightface
State-of-the-art 2D and 3D Face Analysis Project
facebookresearch/MovieGenBench
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
HFUTTUG/HFUT-Beamer
Collection of Beamer themes for Hefei University of Technology
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
GenjiB/AVSiam
Siamese Vision Transformers are Scalable Audio-visual Learners
Xuemantou/R3nzSkin-For-China-Server
Skin changer for League of Legends (LOL)
ZebangCheng/Emotion-LLaMA
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
nttcslab/msm-mae
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
facebookresearch/mae_st
Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners"
stoneMo/DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
52CV/CVPR-2024-Papers
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
zeroQiaoba/MERTools
Toolkits for Multimodal Emotion Recognition
MuiseDestiny/zotero-reference
PDF references add-on for Zotero.
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
ad-m/github-push-action
GitHub Action to push changes back to a repository, e.g. updated code
ZhuoYulang/CIF-MMIN
facebookresearch/MAViL
The repo hosts the code and model of MAViL.
xai-org/grok-1
Grok open release
sunlicai/MAE-DFER
MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition (ACM MM 2023)
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
TadasBaltrusaitis/OpenFace
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
microsoft/AI-For-Beginners
12 Weeks, 24 Lessons, AI for All!