KathPra's Stars
google-research/google-research
Google Research
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps to showcase Meta Llama for WhatsApp & Messenger.
lmcinnes/umap
Uniform Manifold Approximation and Projection
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
vikhyat/moondream
tiny vision language model
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
rom1504/clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
fastai/imagenette
A smaller subset of 10 easily classified classes from Imagenet, and a little more French
beichenzbc/Long-CLIP
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
LAION-AI/CLIP_benchmark
CLIP-like model evaluation
LAION-AI/CLIP-based-NSFW-Detector
MILVLG/bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models
berkeley-hipie/HIPIE
[NeurIPS2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"
bjoern-andres/graph
Graphs and Graph Algorithms in C++, including Minimum Cost (Lifted) Multicuts
facebookresearch/isc2021
Code for the Image similarity challenge.
lyakaap/ISC21-Descriptor-Track-1st
The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21): Descriptor Track.
chs20/RobustVLM
[ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
benchaplin/hungarian-algorithm
Python 3 implementation of the Hungarian Algorithm
aimagelab/safe-clip
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
neuroexplicit-saar/Discover-then-Name
Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024.
ZhouYuxuanYX/MultiMax
This is the official implementation of our ICML 2024 paper "MultiMax: Sparse and Multi-Modal Attention Learning"
zhutong0219/ITIN
Multimodal Sentiment Analysis with Image-Text Interaction Network
ZhouYuxuanYX/Benchmarking-and-Guiding-Adaptive-Sampling-Decoding-for-LLMs
IsaacBravo/streamlit-app
This is an interactive app that allows users to play around with the CLIP model to analyze images
ZhouYuxuanYX/Maximum-Suppression-Regularization
HY-Wong/Thesis
shashankskagnihotri/adv_mmsegmentation
shashankskagnihotri/pruneshift-public