XIANGLIU03's Stars
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
kohjingyu/fromage
Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
facebookresearch/flip
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
Paranioar/SGRAF
[AAAI2021] The code of "Similarity Reasoning and Filtration for Image-Text Matching"
microsoft/BridgeTower
Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning"
HuiChen24/IMRAM
code for our CVPR2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval"
winycg/CLIP-KD
[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation
winycg/MCL
[AAAI-2022 Oral] Official implementations of MCL: Mutual Contrastive Learning for Visual Representation Learning
LinWeizheDragon/FLMR
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
RustamyF/clip-multimodal-ml
Go2Heart/EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
McGill-NLP/diffusion-itm
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
OpenMatch/MARVEL
[ACL 2024] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin".
xinwei666/MMGenerativeIR
Official Code of our AAAI-24 Paper: "Generative Multi-modal Knowledge Retrieval with Large Language Models".
mesnico/ALADIN
Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"
vkhoi/cora_cvpr24
Saehyung-Lee/PlugIR
Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)
96-Zachary/vse_2ad
liyongqi67/GRACE
AAA-Zheng/Listwise_ITR
Official PyTorch implementation of the paper "Integrating Listwise Ranking into Pairwise-based Image-Text Retrieval"
sarahESL/AlignCLIP
AlignCLIP: Improving Cross-Modal Alignment in CLIP
Zjamie813/SelfAlign
AAA-Zheng/LG_ITM
Official PyTorch implementation of the paper "Integrating Language Guidance into Image-Text Matching for Correcting False Negatives"
chrisx599/DSMD
Mario0716/SCCMR-master
Soft Contrastive Cross-Modal Retrieval (PyTorch code)
cluel01/clip-branches
FlyCuteBird/MKTLON
The source code of MKTLON
scvready123/IterWeGO
This is the implementation of our paper, "Leveraging Weak Cross-Modal Guidance for Coherence Modelling via Iterative Learning".
brent-zyy/TVRN