XIANGLIU03's Stars
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
princeton-nlp/SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
rom1504/clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
PKU-DAIR/RAG-Survey
A collection of awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
LinWeizheDragon/Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
mbanani/lgssl
[CVPR 2023] Learning Visual Representations via Language-Guided Sampling
sdc17/UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
RitaRamo/smallcap
SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation
CrossmodalGroup/HREM
Learning Semantic Relationship among Instances for Image-Text Matching, CVPR 2023
Liuziyu77/RAR
The official implementation of RAR
mesnico/TERAN
Code and Resources for the Transformer Encoder Reasoning and Alignment Network (TERAN), accepted for publication in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)
AAA-Zheng/Image-Text-Matching-Summary
Summary of Related Research on Image-Text Matching
Cecile-hi/Multimodal-Learning-with-Alternating-Unimodal-Adaptation
Multimodal Learning Method MLA for CVPR 2024
ppanzx/CHAN
Jiaxuan-Li/EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
hhc1997/L2RM
lerogo/aaai24_itr_cusa
Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"
LCFractal/TGDT
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training
PKU-ICST-MIPL/MKVSE-TOMM2023
zhangy0822/USER
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024
LuminosityX/HAT
Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'
ZhangXu0963/NPC
Code for the paper "Negative Pre-aware for Noisy Cross-modal Matching", AAAI 2024.
ZhangXu0963/VSL
Code for the paper "Image-text Retrieval via Preserving Main Semantic of Vision", ICME 2023.
CapricornGuang/A3R-Cross-Modal-Large-Model-Image-Retrieval
The official implementation of our work for the CVPR 2023 1st Foundation Model Challenge, Cross-Modal Track
Paranioar/DBL
[TIP2024] The code of “Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching”
yic20/CoMC
[ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
wzhings/itmAFA
This repo is for the implementation of Enhancing Image-Text Matching with Adaptive Feature Aggregation, ICASSP 2024
lyan62/RobustCap