hyqyoung's Stars
LlamaFamily/Llama-Chinese
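Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.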
langgptai/LangGPT
LangGPT: Empowering everyone to become a prompt expert! 🚀 Structured Prompt, Language of GPT, structured prompts
cmhungsteve/Awesome-Transformer-Attention
A comprehensive paper list for Vision Transformer/Attention, including papers, code, and related websites
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
InternLM/InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
MzeroMiko/VMamba
VMamba: Visual State Space Models; code is based on Mamba
autodistill/autodistill
Images to inference with no labeling (use foundation models to train supervised models).
wangkai930418/awesome-diffusion-categorized
Collection of diffusion model papers categorized by their subareas
yyyujintang/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
pprp/awesome-attention-mechanism-in-cv
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
Event-AHU/Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
penghao-wu/vstar
PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
longzw1997/Open-GroundingDino
A third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
Jingkang50/OpenPSG
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
JindongGu/Awesome-Prompting-on-Vision-Language-Model
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
Paranioar/Awesome_Matching_Pretraining_Transfering
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.
WisconsinAIVision/ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Charles-Xie/awesome-described-object-detection
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently; pull requests welcome.
AlonzoLeeeooo/awesome-image-inpainting-studies
A collection of awesome image inpainting studies.
zifuwan/Sigma
[WACV 2025] Python implementation of Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation
sarahpratt/CuPL
wengzejia1/Open-VCLIP
king159/Pair-Net
[IEEE TPAMI-2024] Pair then Relation: Pair-Net for Panoptic Scene Graph Generation
Liuziyu77/RAR
The official implementation of RAR
BCV-Uniandes/PNG
cjw2021/QAHOI
zyong812/STIP
Code for CVPR22 paper: Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection.
jasonseu/SALGL
An official codebase of Scene-Aware Label Graph Learning for Multi-Label Image Classification, ICCV 2023.
hutuo1213/CLIPViC