SikaStar
I am a fifth-year PhD student at the National Engineering Lab for Video Technology, Peking University, Beijing, China.
SikaStar's Stars
open-mmlab/mmdetection
OpenMMLab Detection Toolbox and Benchmark
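A minimal sketch of single-image inference with MMDetection, assuming the 2.x-style `mmdet.apis` entry points; the config and checkpoint paths below are placeholders, not files shipped with this listing.

```python
# Hypothetical paths: substitute any config/checkpoint pair from the model zoo.
from mmdet.apis import init_detector, inference_detector

config = "configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py"   # placeholder
checkpoint = "checkpoints/faster_rcnn_r50_fpn_1x_coco.pth"      # placeholder

# Build the detector and load pretrained weights.
model = init_detector(config, checkpoint, device="cuda:0")

# Run detection on one image; the result holds per-class box arrays.
result = inference_detector(model, "demo.jpg")
```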
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
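A minimal sketch of what PEFT does, wrapping a pretrained model so that only small LoRA adapter weights are trained; the base model and `target_modules` choice here are illustrative, not prescribed by the library.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Any Hugging Face causal LM works; GPT-2 is used here for illustration.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA config: low-rank adapters injected into the attention projections.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], lora_dropout=0.05)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter parameters are trainable
```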
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
OpenGVLab/InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, and more. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
ishan0102/vimGPT
Browse the web with GPT-4V and Vimium
microsoft/promptbench
A unified evaluation framework for large language models
IDEA-Research/T-Rex
[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
czczup/ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
muzairkhattak/multimodal-prompt-learning
[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
shenyunhang/APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
longzw1997/Open-GroundingDino
This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
OpenGVLab/all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition&Understanding and General Relation Comprehension of the Open World"
jianghaojun/Awesome-Parameter-Efficient-Transfer-Learning
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
shoumikchow/bbox-visualizer
Make drawing and labeling bounding boxes easy as cake
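A minimal sketch of bbox-visualizer (imported as `bbox_visualizer`); the image path and box coordinates are made up for illustration.

```python
import cv2
import bbox_visualizer as bbv

img = cv2.imread("demo.jpg")           # placeholder image
bbox = [50, 40, 200, 180]              # [x_min, y_min, x_max, y_max]

# Draw the box, then attach a text label above it.
img = bbv.draw_rectangle(img, bbox)
img = bbv.add_label(img, "cat", bbox)

cv2.imwrite("demo_labeled.jpg", img)
```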
lzw-lzw/GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
LijieFan/LaCLIP
[NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"
amazon-science/prompt-pretraining
Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"
baaivision/CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
BAAI-DCAI/Visual-Instruction-Tuning
SVIT: Scaling up Visual Instruction Tuning
Surrey-UP-Lab/RegionSpot
Recognize Any Regions
CVMI-Lab/CoDet
[NeurIPS 2023] CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
Koorye/DePT
[CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning"
LeapLabTHU/Rank-DETR
[NeurIPS 2023] Rank-DETR for High Quality Object Detection
yuxiaochen1103/FDT
ArsenalCheng/Meta-Adapter
[NeurIPS 2023] Meta-Adapter
Hodasia/Awesome-Vision-Language-Finetune
Awesome List of Vision Language Prompt Papers
cv516Buaa/OV-VG