orzlh's Stars
exo-explore/exo
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
lllyasviel/ControlNet-v1-1-nightly
Nightly release of ControlNet 1.1
OFA-Sys/Chinese-CLIP
A Chinese version of CLIP for Chinese cross-modal retrieval and representation generation.
ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
X-PLUG/Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
facebookresearch/paco
This repo contains the documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts, and visualization notebooks.
ultralytics/flickr_scraper
Simple Flickr Image Scraper
sachit-menon/classify_by_description_release
SiTH-Diffusion/SiTH
[CVPR 2024] SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion
layer6ai-labs/xpool
https://layer6ai-labs.github.io/xpool/
wengzejia1/Open-VCLIP
YueYANG1996/LaBo
CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
zjukg/DUET
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Jiamian-Wang/T-MASS-text-video-retrieval
Official implementation of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval" (CVPR 2024 Highlight)
OmkarThawakar/composed-video-retrieval
Composed Video Retrieval
yassersouri/ghiaseddin
Author's implementation of the paper "Deep Relative Attributes" (ACCV 2016)
wangyu-ustc/LM4CV
The official implementation of the paper "Learning Concise and Descriptive Attributes for Visual Recognition"
fangkaipeng/ProS
ferjad/I2DFormer
Code for CVPR23 Highlight "I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification" and NeurIPS2022 "I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification"