ZeyueT's Stars
VideoVerses/VideoTuna
Let's finetune video generation models!
zarathustr/LibQPEP
TRO 2022 - QPEP: A C++/MATLAB library for solving generalized quadratic pose estimation problems and related uncertainty description
ZeyueT/VidMuse
bronyayang/Law_of_Vision_Representation_in_MLLMs
Official implementation of the Law of Vision Representation in MLLMs
FionaFN/MultiTarget_WiFi_DFL
atong01/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
litwellchi/MMTrail
[arXiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
gcui-art/suno-api
Use an API to call suno.ai's music generation AI and easily integrate it into agents like GPTs.
seungheondoh/lp-music-caps
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
qiuqiangkong/audioset_tagging_cnn
JeremyCJM/DiffSHEG
[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
csteinmetz1/ai-audio-startups
Community list of startups working with AI in audio and music technology
hf-lin/ChatMusician
yzxing87/Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
LianQi-Kevin/wav2clip-changed
FurkanGozukara/Stable-Diffusion
FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Lionelsy/Conference-Accepted-Paper-List
Accepted paper lists for several conferences (including AI, ML, and robotics)
hollobit/GenAI_LLM_timeline
ChatGPT, Generative AI and LLMs Timeline
zhvng/open-musiclm
Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
mayuelala/FollowYourPose
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
wzk1015/video-bgm-generation
[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer
ChenyangQiQi/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
jiaxinxie97/HFGI3D
ChenyangLEI/All-In-One-Deflicker
[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas