ilovecv

ilovecv's Stars

QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python 14.4k 109 1.1k 1.2k
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python 5.1k 49 453386
Kwai-Kolors/Kolors
Kolors Team
Language: Python 3.9k 38 138277
Yutong-Zhou-cv/Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
2.2k 75 7189
dvlab-research/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Language: Python 1.9k 11 156132
XLabs-AI/x-flux
Language: Python 1.7k 29 118121
TencentARC/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Language: Python 1.5k 42 71121
OpenGVLab/VisionLLM
VisionLLM Series
Language: Python 940 44 1529
NVlabs/MambaVision
Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Language: Python 841 17 4043
TencentARC/SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
Language: Python 755 15 3058
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Language: Python 685 10 4436
buoyancy99/diffusion-forcing
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Language: Python 627 6 2330
limuloo/MIGC
[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)
Language: Python 548 22 1527
frank-xwang/InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Language: Python 516 8 3729
Zj-BinXia/DiffIR
This project is the official implementation of 'Diffir: Efficient diffusion model for image restoration', ICCV 2023
Language: Jupyter Notebook 478 5 6820
dvlab-research/LLMGA
This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant', ECCV 2024 Oral
Language: Python 464 13 529
mira-space/MiraData
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
Language: Python 374 13 1910
tryonlabs/tryondiffusion
TryOnDiffusion: A Tale of Two UNets Implementation
Language: Jupyter Notebook 364 35 1943
showlab/BoxDiff
[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Language: Python 253 4 1917
baaivision/EVE
[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
Language: Python 235 8 163
feizc/DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
Language: Python 215 6 79
OpenGVLab/MM-Interleaved
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
Language: Python 199 4 711
xichenpan/ARLDM
Official PyTorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Language: Python 192 12 2729
HJYao00/DenseConnector
【NeurIPS 2024】Dense Connector for MLLMs
Language: Python 142 3 75
AFeng-x/PixWizard
Language: Python 123 11 40
microsoft/ReCo
ReCo: Region-Controlled Text-to-Image Generation, CVPR 2023
Language: Jupyter Notebook 121 5 1310
Yangyi-Chen/SOLO
[TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"
Language: Jupyter Notebook 118 2 84
Kwai-Kolors/MPS
Language: Python 116 4 106
fusiming3/MARS
Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
842
zjlww/ardit-web
Language: HTML 25 4 01