Du-Yao's Stars
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
babysor/MockingBird
🚀AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
moymix/TaskMatrix
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
HumanAIGC/AnimateAnyone
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
guoyww/AnimateDiff
Official implementation of AnimateDiff.
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
XavierXiao/Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
datawhalechina/learn-nlp-with-transformers
We want to create a repo to illustrate the usage of transformers in Chinese.
huggingface/evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
ChenHsing/Awesome-Video-Diffusion-Models
[CSUR] A Survey on Video Diffusion Models
zoubohao/DenoisingDiffusionProbabilityModel-ddpm-
This may be the simplest implementation of DDPM. You can directly run Main.py to train the UNet on the CIFAR-10 dataset and see the amazing process of denoising.
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
google-research/magvit
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
Newbeeer/Poisson_flow
Code for NeurIPS 2022 Paper, "Poisson Flow Generative Models" (PFGM)
m-bain/webvid
Large-scale text-video dataset. 10 million captioned short videos.
lucidrains/magvit2-pytorch
Implementation of MagViT2 Tokenizer in Pytorch
YingqingHe/LVDM
LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation
JosephKJ/Awesome-Layout-Generators
An awesome list of layout generation papers
PKU-ICST-MIPL/PosterLayout-CVPR2023
Official repository for "PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout" (CVPR 2023).
SooLab/Free-Bloom
[NeurIPS 2023] Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
kabachuha/InfiNet
Implementation of the DiffusionOverDiffusion architecture presented in NUWA-XL, in the form of a ControlNet-like module on top of the ModelScope text2video model, for extremely long video generation.
CyberAgentAILab/canvas-vae
Implementation of CanvasVAE: Learning to Generate Vector Graphic Documents, ICCV 2021
CyberAgentAILab/flex-dm
Towards Flexible Multi-modal Document Models [Inoue+, CVPR2023]
ahoarfrost/LookingGlass
Pretrained LookingGlass language model for biological read-length DNA sequences, and related models derived from transfer learning
zqp111/diffusion