zero-shot-classification

There are 119 repositories under the zero-shot-classification topic.

  • mlfoundations/open_clip

    An open-source implementation of CLIP; a minimal zero-shot classification sketch using it appears after this list.

    Language: Python
  • roboflow/notebooks

    A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, and Qwen2.5VL.

    Language: Jupyter Notebook
  • OpenGVLab/InternVideo

    [ECCV2024] Video Foundation Models & Data for Multimodal Understanding

    Language: Python
  • diffusion-classifier/diffusion-classifier

    Diffusion Classifier leverages pretrained diffusion models to perform zero-shot classification without additional training; the idea is sketched after this list.

    Language: Python
  • nlpodyssey/cybertron

    Cybertron: the home planet of the Transformers in Go

    Language: Go
  • UCSC-VLAA/CLIPA

    [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"

    Language: Python
  • Colin97/OpenShape_code

    Official code of “OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding”

    Language: Python
  • LAION-AI/scaling-laws-openclip

    Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)

    Language: Jupyter Notebook
  • salesforce/MUST

    PyTorch code for MUST

    Language: Python
  • zhengli97/ATPrompt

    [ICCV 2025] Official PyTorch Code for "Advancing Textual Prompt Learning with Anchored Attributes"

    Language: Python
  • HieuPhan33/CVPR2024_MAVL

    Multi-Aspect Vision Language Pretraining - CVPR2024

    Language: Python
  • Kardbord/hfapigo

    Unofficial Go (Golang) bindings for the Hugging Face Inference API; the underlying zero-shot-classification request is sketched after this list.

    Language: Go
  • elkhouryk/RS-TransCLIP

    [ICASSP 2025] Open-source code for the paper "Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification"

    Language: Python
  • tmlr-group/WCA

    [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"

    Language: Python
  • encord-team/text-to-image-eval

    Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KNN accuracy.

    Language: Jupyter Notebook
  • akshitac8/Generative_MLZSL

    [TPAMI 2023] Generative Multi-Label Zero-Shot Learning

    Language: Python
  • shiming-chen/MSDN

    Official PyTorch Implementation of MSDN (CVPR'22)

    Language: Python
  • rhysdg/vision-at-a-clip

    Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts

    Language: Jupyter Notebook
  • GT4SD/zero-shot-bert-adapters

    Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.

    Language: Python
  • anastasiia-p/airflow-ml

    Airflow Pipeline for Machine Learning

    Language: Python
  • filipbasara0/simple-clip

    A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch

    Language: Jupyter Notebook
  • PrithivirajDamodaran/Alt-ZSC

    Alternate implementation for zero-shot text classification: instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models to do ZSC. It can therefore be lightweight and support more languages without trading off accuracy. (Super simple; a 10th-grader could write this, but since no 10th-grader did, I did.) - Prithivi Da. The NLI-reframed baseline it replaces is sketched after this list.

    Language: Python
  • ronaldseoh/atsc_prompts

    Codes for the experiments in our EMNLP 2021 paper "Open Aspect Target Sentiment Classification with Natural Language Prompts"

    Language: Jupyter Notebook
  • UCSC-VLAA/MixCon3D

    [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"

    Language: Python
  • baskargroup/BioTrove

    NeurIPS 2024 Track on Datasets and Benchmarks (Spotlight)

    Language: Jupyter Notebook
  • pha123661/NTU-2022Fall-DLCV

    Deep Learning for Computer Vision (深度學習於電腦視覺), taught by Frank Wang (王鈺強)

    Language: Python
  • mondalanindya/MSQNet

    Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]

    Language: Python
  • yueyu1030/ReGen

    [ACL'23 Findings] This is the code repo for our ACL'23 Findings paper "ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval".

    Language: Python
  • HanaFEKI/AI_BasketBall_Analysis_v1

    The system detects players and the ball with YOLO, assigns teams via zero-shot jersey classification, tracks ball possession, maps court keypoints, transforms the view to top-down, and calculates player speed and distance.

    Language: Python
  • visresearch/DGMR

    The official implementation of "Diversity-Guided MLP Reduction for Efficient Large Vision Transformers"

    Language: Python
  • cloudera/CML_AMP_Few-Shot_Text_Classification

    Perform topic classification on news articles in several limited-labeled data regimes.

    Language: Jupyter Notebook
  • CogComp/Benchmarking-Zero-shot-Text-Classification

    Code for the EMNLP 2019 paper "Benchmarking zero-shot text classification: datasets, evaluation and entailment approach"

    Language: Python
  • VectorInstitute/mmlearn

    A toolkit for research on multimodal representation learning

    Language: Python
  • ytaek-oh/fsc-clip

    [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality

    Language: Python
  • JinhaoLee/WCA

    [ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models

    Language: Python
  • KimRass/CLIP

    PyTorch implementation of CLIP (Radford et al., 2021) from scratch, trained on Flickr8k + Flickr30k

    Language: Python
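
For reference, the sketch below shows the usual zero-shot image classification flow with mlfoundations/open_clip, referenced from its entry above. The model name, pretrained tag, image path, and candidate captions are illustrative choices, not anything prescribed by the repository.

```python
# Minimal zero-shot image classification sketch with open_clip.
# The checkpoint tag and image path are illustrative; any model/pretrained
# pair listed by open_clip.list_pretrained() works the same way.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

image = preprocess(Image.open("example.jpg")).unsqueeze(0)
labels = ["a photo of a cat", "a photo of a dog", "a photo of a bird"]
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each candidate caption,
    # softmaxed into a distribution over the labels.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(dict(zip(labels, probs[0].tolist())))
```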
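The diffusion-classifier entry describes classifying an image by how well a pretrained text-conditioned diffusion model denoises it under each candidate label. The sketch below illustrates that idea with Hugging Face diffusers components; the checkpoint name, prompts, and single (timestep, noise) sample are assumptions for illustration, not the repository's actual implementation, which averages the error over many timestep and noise samples.

```python
# Sketch of the Diffusion Classifier idea: score each candidate label by the
# denoising error of a pretrained text-to-image diffusion model conditioned
# on that label, and predict the label with the lowest error.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

repo = "CompVis/stable-diffusion-v1-4"  # assumed checkpoint for illustration
vae = AutoencoderKL.from_pretrained(repo, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(repo, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
scheduler = DDPMScheduler.from_pretrained(repo, subfolder="scheduler")

@torch.no_grad()
def denoising_error(image, prompt, timestep=500):
    """MSE between the true noise and the noise the UNet predicts for `prompt`."""
    latents = vae.encode(image).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.tensor([timestep])
    noisy = scheduler.add_noise(latents, noise, t)
    tokens = tokenizer(prompt, padding="max_length",
                       max_length=tokenizer.model_max_length,
                       truncation=True, return_tensors="pt")
    text_emb = text_encoder(tokens.input_ids)[0]
    pred = unet(noisy, t, encoder_hidden_states=text_emb).sample
    return F.mse_loss(pred, noise).item()

def classify(image, labels):
    # `image` is a (1, 3, 512, 512) float tensor scaled to [-1, 1].
    errors = {lab: denoising_error(image, f"a photo of a {lab}") for lab in labels}
    return min(errors, key=errors.get)
```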
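Kardbord/hfapigo wraps the Hugging Face Inference API from Go; its zero-shot-classification task boils down to one HTTP request. Below is a minimal sketch of that request, written in Python for consistency with the other sketches here; the model name and token are placeholders.

```python
# Minimal sketch of the Hugging Face Inference API request that zero-shot
# classification bindings such as hfapigo wrap. Model name and token are
# placeholders; the response contains ranked labels and scores.
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/facebook/bart-large-mnli"
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

payload = {
    "inputs": "I loved the battery life but the screen is too dim.",
    "parameters": {"candidate_labels": ["battery", "display", "price", "service"]},
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
print(response.json())  # e.g. {"sequence": ..., "labels": [...], "scores": [...]}
```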
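Several text-oriented entries (Alt-ZSC, CogComp's benchmark, ReGen) refer to the standard entailment/NLI reframing of zero-shot text classification. That baseline is what the transformers zero-shot-classification pipeline implements; the model choice below is illustrative.

```python
# Baseline NLI-reframed zero-shot text classification, the approach Alt-ZSC
# contrasts with: each candidate label is turned into a hypothesis such as
# "This example is about {label}." and scored with an NLI model.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new GPU drivers cut inference latency by 30%.",
    candidate_labels=["hardware", "sports", "cooking"],
    hypothesis_template="This example is about {}.",
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```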