ZeyueT

ZeyueT's Stars

VideoVerses/VideoTuna
Let's finetune video generation models!
Language: Python 34813
zarathustr/LibQPEP
TRO 2022 - QPEP: A C++/MATLAB library for solving generalized quadratic pose estimation problems and related uncertainty description
Language: MATLAB 17517
ZeyueT/VidMuse
Language: Python 40
bronyayang/Law_of_Vision_Representation_in_MLLMs
Official implementation of the Law of Vision Representation in MLLMs
Language: Python 1458
FionaFN/MultiTarget_WiFi_DFL
Language: Jupyter Notebook 1
atong01/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
Language: Python, 1.4k stars, 117 forks
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language: Python, 2.8k stars, 272 forks
litwellchi/MMTrail
[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Language: Python 271
gcui-art/suno-api
Use API to call the music generation AI of suno.ai, and easily integrate it into agents like GPTs.
Language: TypeScript, 1.6k stars, 386 forks
seungheondoh/lp-music-caps
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
Language: Python 29434
qiuqiangkong/audioset_tagging_cnn
Language: Python, 1.4k stars, 259 forks
JeremyCJM/DiffSHEG
[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
Language: Python 13711
csteinmetz1/ai-audio-startups
Community list of startups working with AI in audio and music technology
1.6k stars, 139 forks
hf-lin/ChatMusician
Language: Python 22425
yzxing87/Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
Language: Python 1377
YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Language: HTML 40123
LianQi-Kevin/wav2clip-changed
Language: Python 21
FurkanGozukara/Stable-Diffusion
FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney, RunPod
Language: Jupyter Notebook, 2.2k stars, 305 forks
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook, 36.6k stars, 4.3k forks
Lionelsy/Conference-Accepted-Paper-List
Some Conferences' accepted paper lists (including AI, ML, Robotic)
98674
hollobit/GenAI_LLM_timeline
ChatGPT, GenerativeAI and LLMs Timeline
94759
zhvng/open-musiclm
Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.
Language: Python 53360
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
Language: Python, 8.7k stars, 1.1k forks
mayuelala/FollowYourPose
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose : Pose-Guided Text-to-Video Generation using Pose-Free Videos"
Language: Python, 1.3k stars, 90 forks
wzk1015/video-bgm-generation
[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer
Language: Python 29935
ChenyangQiQi/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
Language: Jupyter Notebook, 1.1k stars, 106 forks
jiaxinxie97/HFGI3D
Language: Jupyter Notebook 20316
ChenyangLEI/All-In-One-Deflicker
[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas
Language: Python 71942