olccihyeon's Stars
joonspk-research/generative_agents
Generative Agents: Interactive Simulacra of Human Behavior
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPUs. Supports default and custom datasets for applications such as summarization and Q&A, and a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
stas00/ml-engineering
Machine Learning Engineering Open Book
dabeaz-course/python-mastery
Advanced Python Mastery (course by @dabeaz)
cmhungsteve/Awesome-Transformer-Attention
A comprehensive paper list on Vision Transformers/attention, including papers, code, and related websites
teddylee777/machine-learning
A repository intended to help machine learning beginners and those preparing for study groups.
cvlab-columbia/viper
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
likejazz/llama3.np
llama3.np is a pure NumPy implementation of the Llama 3 model.
NUS-HPC-AI-Lab/Neural-Network-Parameter-Diffusion
We introduce a novel approach to parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
gabriben/awesome-generative-information-retrieval
xmed-lab/CLIP_Surgery
CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
ziplab/SN-Net
[CVPR 2023 Highlight] This is the official implementation of "Stitchable Neural Networks".
kongds/E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
haokunwen/Awesome-Composed-Image-Retrieval
Collection of Composed Image Retrieval (CIR) papers.
navervision/lincir
Official PyTorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)
TIGER-AI-Lab/UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
muzairkhattak/ProText
[AAAI'25, CVPRW 2024] Official repository of paper titled "Learning to Prompt with Text Only Supervision for Vision-Language Models".
umd-huang-lab/perceptionCLIP
Code for our ICLR 2024 paper "PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts"
Code-kunkun/ZS-CIR
[BMVC 2023] Zero-shot Composed Text-Image Retrieval
OpenMatch/UniVL-DR
[ICLR 2023] Code for the paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval".
facebookresearch/Whac-A-Mole
Code for the paper "A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others"
haofanwang/cropimage
A simple toolkit for detecting and cropping the main subject from pictures. Supports face and saliency detection.
lezhang7/Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
zycheiheihei/Transferable-Visual-Prompting
[CVPR 2024 Highlight] Official implementation of the paper "Exploring the Transferability of Visual Prompting for Multimodal Large Language Models".
JUNJIE99/VISTA_Evaluation_FineTuning
Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.
luomancs/ReMuQ
A multimodal retrieval dataset
suoych/KEDs
Implementation of the paper "Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval" (CVPR 2024)
levymsn/LaSCo
Official repository of the LaSCo dataset
tmlabonte/last-layer-retraining
Official codebase for the NeurIPS 2023 paper "Towards Last-layer Retraining for Group Robustness with Fewer Annotations". https://arxiv.org/abs/2309.08534
clause-bielefeld/wikiscenes_descriptions
Dataset of annotated text–image alignments for Wikiscenes (a dataset of multimodal Wikipedia articles on buildings)