zyuanbing
Student at NLPR, CASIA, pursuing an M.Sc. in computer science. Interested in computer vision, especially network architecture design.
zyuanbing's Stars
ChenDelong1999/subobjects
Official repository of the paper "Subobject-level Image Tokenization"
DirtyHarryLYL/LLM-in-Vision
Recent LLM-based computer vision and related works. Comments and contributions welcome!
vpulab/ovam
Code for the paper Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models @ CVPR 2024
JShollaj/awesome-llm-interpretability
A curated list of Large Language Model (LLM) Interpretability resources.
mbanani/probe3d
[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models
meta-llama/llama3
The official Meta Llama 3 GitHub site
sinahmr/NACLIP
PyTorch Implementation of NACLIP in "Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"
yossigandelsman/clip_text_span
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
google/diffseg
DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an experimental feature to add semantic labels to the masks based on a generated caption.
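The core of DiffSeg is easy to picture: each spatial anchor in the diffusion model's self-attention carries a probability distribution over image locations, and anchors with similar distributions are merged into segment proposals. Below is a minimal, illustrative sketch of that merging step only, not the official code (which is in TensorFlow); the threshold `tau` and the helper names are hypothetical.

```python
# Illustrative sketch of DiffSeg's attention-map merging idea (not official code).
import torch

def symmetric_kl(p, q, eps=1e-8):
    """Symmetric KL divergence between two attention distributions."""
    p, q = p.clamp_min(eps), q.clamp_min(eps)
    return ((p * (p / q).log()).sum() + (q * (q / p).log()).sum()) / 2

def merge_attention_maps(attn, tau=1.0):
    """attn: (N, H*W) rows are per-anchor attention distributions (sum to 1).
    Greedily absorbs each map into its nearest proposal if close enough,
    otherwise starts a new segment proposal."""
    proposals = [attn[0]]
    for a in attn[1:]:
        dists = torch.stack([symmetric_kl(a, p) for p in proposals])
        i = int(dists.argmin())
        if dists[i] < tau:
            proposals[i] = (proposals[i] + a) / 2  # absorb into nearest proposal
        else:
            proposals.append(a)                    # new segment proposal
    return proposals

attn = torch.softmax(torch.randn(8, 64), dim=-1)   # 8 anchors on an 8x8 grid
print(len(merge_attention_maps(attn, tau=1.0)))    # number of merged proposals
```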
kyegomez/VisionMamba
Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It is 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.
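For intuition, the "bidirectional" part means patch tokens are scanned as a sequence in both directions and the two passes are fused per token. A conceptual sketch under that reading, with an off-the-shelf GRU standing in for the selective state-space layer (so this is neither the repo's API nor the paper's exact block):

```python
# Conceptual sketch of a bidirectional token scan (GRU stands in for the SSM).
import torch
import torch.nn as nn

class BidirectionalScan(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)  # stand-in for fwd SSM scan
        self.bwd = nn.GRU(dim, dim, batch_first=True)  # stand-in for bwd SSM scan

    def forward(self, tokens):                 # tokens: (batch, n_patches, dim)
        out_f, _ = self.fwd(tokens)
        out_b, _ = self.bwd(tokens.flip(1))    # scan the reversed sequence
        return out_f + out_b.flip(1)           # fuse both directions per token

x = torch.randn(2, 196, 192)                   # 14x14 patches, dim 192
print(BidirectionalScan(192)(x).shape)         # torch.Size([2, 196, 192])
```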
MengyuWang826/SegRefiner
SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process
kyegomez/Vit-RGTS
Open source implementation of "Vision Transformers Need Registers"
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
openai/transformer-debugger
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
xai-org/grok-1
Grok open release
Haiyang-W/GiT
Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
52CV/CVPR-2024-Papers
HarborYuan/ovsam
[arXiv preprint] The official code of the paper "Open-Vocabulary SAM".
dvlab-research/Prompt-Highlighter
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
wangf3014/SCLIP
Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
lambert-x/ProLab
Official PyTorch implementation of the paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties"
TransformerLensOrg/TransformerLens
A library for mechanistic interpretability of GPT-style language models
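A short usage sketch based on TransformerLens's documented entry points (`HookedTransformer.from_pretrained` and `run_with_cache`); the cached-activation indexing follows the library's naming scheme:

```python
# Load a model and capture every intermediate activation in one forward pass.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")     # GPT-2 small
logits, cache = model.run_with_cache("Hello, world")  # forward + activation cache
# The cache exposes activations by name, e.g. the layer-0 attention pattern:
attn_pattern = cache["pattern", 0]  # (batch, heads, query_pos, key_pos)
print(attn_pattern.shape)
```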
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
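A usage sketch following the repo's README; the hyperparameter values below are illustrative, not recommendations:

```python
# Quantize a batch of feature vectors against a learned codebook.
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(
    dim=256,               # feature dimension of the inputs
    codebook_size=512,     # number of codebook entries
    decay=0.8,             # EMA decay for codebook updates
    commitment_weight=1.0  # weight of the commitment loss term
)
x = torch.randn(1, 1024, 256)            # (batch, sequence, dim)
quantized, indices, commit_loss = vq(x)  # (1, 1024, 256), (1, 1024), scalar loss
```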
MaverickRen/PixelLM
PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding, accepted to CVPR 2024.
ytongbai/LVM
dvlab-research/LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
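FastChat serves models behind an OpenAI-compatible REST API, so existing OpenAI client code can point at a local Vicuna. A sketch assuming the API server and a model worker are already running on localhost:8000 and that the worker registered the model name `vicuna-7b-v1.5` (both assumptions; uses the legacy pre-1.0 `openai` client):

```python
# Query a locally served model through FastChat's OpenAI-compatible server.
import openai

openai.api_key = "EMPTY"                       # FastChat does not check keys
openai.api_base = "http://localhost:8000/v1"   # local FastChat API server

resp = openai.ChatCompletion.create(
    model="vicuna-7b-v1.5",                    # assumed registered model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```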
LLaVA-VL/LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills