XIANGLIU03

XIANGLIU03's Stars

mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
Language: Python | 3.8k 48 176286
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Language: Python | 2.1k 19 47201
kohjingyu/fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
Language: Jupyter Notebook | 478 12 3835
facebookresearch/flip
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
Language: Python | 408 6 215
Paranioar/SGRAF
[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”
Language: Python | 213 5 1936
microsoft/BridgeTower
Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning"
Language: Python | 158 12 96
HuiChen24/IMRAM
Code for our CVPR 2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval"
Language: Python | 91 1 929
winycg/CLIP-KD
[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation
Language: Python | 81 5 132
winycg/MCL
[AAAI-2022 Oral] Official implementations of MCL: Mutual Contrastive Learning for Visual Representation Learning
Language: Python | 72 1 34
LinWeizheDragon/FLMR
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
Language: Python | 71 3 294
RustamyF/clip-multimodal-ml
Language: Jupyter Notebook | 52 2 06
Go2Heart/EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
Language: Python | 41 1 82
McGill-NLP/diffusion-itm
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
Language: Python | 31 7 61
OpenMatch/MARVEL
[ACL 2024] Code for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin".
Language: Python | 30 6 13
xinwei666/MMGenerativeIR
Official Code of our AAAI-24 Paper: "Generative Multi-modal Knowledge Retrieval with Large Language Models".
Language: Python | 23 5 20
mesnico/ALADIN
Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"
Language: Python | 22 5 35
vkhoi/cora_cvpr24
Language: Python | 21 4 30
Saehyung-Lee/PlugIR
Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)
Language: Python | 18 4 13
96-Zachary/vse_2ad
Language: Python | 16 2 23
liyongqi67/GRACE
Language: Python | 12 1 11
AAA-Zheng/Listwise_ITR
Official PyTorch implementation of the paper "Integrating Listwise Ranking into Pairwise-based Image-Text Retrieval"
Language: Python | 8 1 10
sarahESL/AlignCLIP
AlignCLIP: Improving Cross-Modal Alignment in CLIP
Language: Python | 8 3 00
Zjamie813/SelfAlign
Language: Python | 8 1 10
AAA-Zheng/LG_ITM
Official PyTorch implementation of the paper "Integrating Language Guidance into Image-Text Matching for Correcting False Negatives"
Language: Python | 5 1 21
chrisx599/DSMD
Language: Python | 4 1 10
Mario0716/SCCMR-master
Soft Contrastive Cross-Modal Retrieval (PyTorch code)
Language: Jupyter Notebook | 4 2 00
cluel01/clip-branches
Language: Python | 3 2 11
FlyCuteBird/MKTLON
The source code of MKTLON
Language: Python | 3 1 10
scvready123/IterWeGO
This is the implementation of our paper, "Leveraging Weak Cross-Modal Guidance for Coherence Modelling via Iterative Learning".
Language: Python | 3 1 1
brent-zyy/TVRN
Language: Python | 2 1 0