KaiyangZhou's Stars

CompVis/stable-diffusion
A latent text-to-image diffusion model
Language: Jupyter Notebook 68.5k 557 714 10.2k
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Language: Python 57.5k 414 3.8k 6.1k
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language: Python 37k 352 1.8k 4.6k
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 29.6k 342 268 4.1k
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook 10k 99 667974
mistralai/mistral-inference
Official inference library for Mistral models
Language: Jupyter Notebook 9.8k 126 145870
bpc-clone/bypass-paywalls-chrome-clean
5.2k 42 164409
meta-llama/llama-models
Utilities intended for use with Llama models.
Language: Python 4.9k 66 131841
google-deepmind/dm_control
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Language: Python 3.8k 128 415668
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Language: Python 3.6k 100 161243
louisfb01/best_AI_papers_2022
A curated list of the latest breakthroughs in AI (in 2022) by release date with a clear video explanation, link to a more in-depth article, and code.
3.2k 100 1205
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Language: Python 2.2k 19 82182
Computer-Vision-in-the-Wild/CVinW_Readings
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
1.2k 38 658
chongzhou96/EdgeSAM
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
Language: Jupyter Notebook 942 17 3642
facebookresearch/omnivore
Omnivore: A Single Model for Many Visual Modalities
Language: Python 559 19 3139
Jingkang50/OpenPSG
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
Language: Python 426 6 9470
rshaojimmy/MultiModal-DeepFake
[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond
Language: Python 365 3 4227
jiawei-ren/diffmimic
[ICLR 2023] DiffMimic: Efficient Motion Mimicking with Differentiable Physics https://arxiv.org/abs/2304.03274
Language: Python 277 12 621
Mehooz/vision4leg
Toolkit for vision-guided quadrupedal locomotion research
Language: Python 239 3 1628
yuhangzang/ContextDET
Contextual Object Detection with Multimodal Large Language Models
Language: Python 204 14 85
google-deepmind/perception_test
Language: Jupyter Notebook 193 9 2415
sarahpratt/CuPL
Language: Python 171 2 914
ZhangYuanhan-AI/visual_prompt_retrieval
[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
Language: Python 166 4 117
WisconsinAIVision/MaskPoint
[ECCV 2022] Masked Discrimination for Self-Supervised Learning on Point Clouds
Language: Python 93 8 85
shihongl1998/LLM-as-a-blackbox-optimizer
Language: Python 67 4 510
yuhangzang/UPT
56 8 101
m-Just/OoD-Bench
Language: Python 42 1 63
KaiyangZhou/on-device-dg
On-Device Domain Generalization
Language: Python 41 2 24
TencentARC/pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
Language: Python 32 5 21
Aaditya-Singh/Low-Shot-Robustness
Code for the ICCV 2023 paper "Benchmarking Low-Shot Robustness to Natural Distribution Shifts"
Language: Python 11 4 10