gaohailiang520's Stars

moymix/TaskMatrix
Language: Python 34.5k 309 3483.3k
lllyasviel/ControlNet
Let us control diffusion models!
Language: Python 29.9k 217 5452.7k
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language: Python 25.3k 218 4592.9k
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
Language: Jupyter Notebook 9.9k 84 248817
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook 9.7k 97 653949
facebookresearch/llama-recipes
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps showcase Llama2 for WhatsApp & Messenger.
Language: Jupyter Notebook 7.8k 68 2271.1k
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Language: Python 2.9k 28 179207
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python 2.7k 32 156243
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Language: Python 2.2k 32 261242
DmitryRyumin/ICCV-2023-Papers
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
Language: Python 918 13 1042
google-deepmind/materials_discovery
Language: Jupyter Notebook 869 45 22138
OpenDriveLab/DriveLM
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
Language: HTML 805 19 8450
Curt-Park/segment-anything-with-clip
Segment Anything combined with CLIP
Language: Python 329 1 423
fundamentalvision/Uni-Perceiver
Language: Python 266 9 1821
azshue/TPT
Test-time Prompt Tuning (TPT) for zero-shot generalization in vision-language models (NeurIPS 2022)
Language: Python 136 3 1516
MediaBrain-SJTU/EqMotion
[CVPR2023] EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning
Language: Python 114 1 813
starmemda/CAMoE
Language: Python 94 6 99
IMCCretrieval/ProST
Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval (ICCV 2023 Oral)
Language: Python 89 3 71
whwu95/ATM
【ICCV'2023】What Can Simple Arithmetic Operations Do for Temporal Modeling?
Language: Python 73 8 45
chunmeifeng/FedPR
【CVPR 2023】Learning Federated Visual Prompt in Null Space for MRI Reconstruction
Language: Python 42 2 14
ArsenalCheng/Meta-Adapter
[NeurIPS 2023] Meta-Adapter
Language: Python 35 3 20
cvlab-stonybrook/fsl-rsvae
Language: Python 34 6 54
shuanglinyan/CFine
CLIP-Driven Fine-grained Text-Image Person Re-identification
Language: Python 33 4 81
tomchen-ctj/OST
【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Language: Python 33 4 30
skelemoa/synse-zsl
Official PyTorch code for the ICIP 2021 paper 'Syntactically Guided Generative Embeddings For Zero Shot Skeleton Action Recognition'
Language: Jupyter Notebook 29 4 144
wlin-at/MAXI
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
Language: Python 25 3 53
YujieOuO/SMIE
Official PyTorch implementation of "Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization" (ACM MM 2023).
Language: Python 21 2 34
KPeng9510/RelaMiX
Language: Python 19 1 10
Sarinda251/CDFSL-V
Accepted at ICCV '23
Language: Python 13 1 30
HuiGuanLab/Lite-MKD
Source code for the MM'23 paper "Lite-MKD: A Multi-modal Knowledge Distillation Framework for Lightweight Few-shot Action Recognition"
Language: Python 3 2 20