xiaojieli0903
Ph.D. candidate at the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen).
HIT (Shenzhen), Shenzhen
xiaojieli0903's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
lllyasviel/ControlNet
Let us control diffusion models!
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
mlfoundations/open_clip
An open source implementation of CLIP.
lucidrains/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
lucidrains/imagen-pytorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
open-mmlab/mmdetection3d
OpenMMLab's next-generation platform for general 3D object detection.
lucidrains/DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
baaivision/Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
ttengwang/Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
justinpinkney/stable-diffusion
198808xc/Pangu-Weather
An official implementation of Pangu-Weather
chq1155/A-Survey-on-Generative-Diffusion-Model
EdisonLeeeee/Awesome-Masked-Autoencoders
A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).
yxuansu/PandaGPT
[TLLM'23] PandaGPT: One Model To Instruction-Follow Them All
muzairkhattak/multimodal-prompt-learning
[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
HighCWu/ControlLoRA
ControlLoRA: A Lightweight Neural Network To Control Stable Diffusion Spatial Information
Lupin1998/Awesome-MIM
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
yzd-v/cls_KD
'NKD and USKD' (ICCV 2023) and 'ViTKD' (CVPRW 2024)
X-PLUG/mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
ju-chen/Efficient-Prompt
adobe-research/affordance-insertion
wudongming97/RMOT
[CVPR2023] Referring Multi-Object Tracking