cross-modality

There are 38 repositories under the cross-modality topic.

jina-ai/clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Language: Python 12.7k 222 6152.1k
zai-org/CogVLM
a state-of-the-art-level open visual language model | multimodal pre-trained model
Language: Python 6.7k 71 441440
KimMeen/Time-LLM
[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Language: Python 2.2k 21 175386
hangzhaomit/Sound-of-Pixels
Codebase for ECCV18 "The Sound of Pixels"
Language: Python 385 15 1374
layumi/Image-Text-Embedding
TOMM2020 Dual-Path Convolutional Image-Text Embedding with Instance Loss 🐾 https://arxiv.org/abs/1711.05535
Language: MATLAB 295 11 1873
movienet/movienet-tools
Tools for movie and video research
Language: C++ 294 10 4137
haofanwang/awesome-conditional-content-generation
Up-to-date resources for conditional content generation, including human motion generation, image or video generation and editing.
276 15 127
sail-sg/ptp
[CVPR2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
Language: Python 152 8 104
bismex/Awesome-cross-modality-person-re-identification
Awesome Cross-modality Person Re-identification
148 8 132
ZYK100/LLCM
[CVPR 2023] Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification
Language: Python 130 3 3413
Event-AHU/EventVOT_Benchmark
[CVPR-2024] The First High Definition (HD) Event based Visual Object Tracking Benchmark Dataset
Language: Python 125 4 345
AnjanDutta/sem-pcyc
PyTorch implementation of the paper "Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval", CVPR 2019.
Language: Python 110 10 3124
rhgao/co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
Language: Python 97 3 1622
mangye16/Visible-Thermal-Person-Re-Identification
Demo code for visible thermal (cross-modality) person re-identification
Language: Python 90 3 718
WinfredGe/T2S
[IJCAI 2025] Official implementation of "T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models"
Language: Python 56
AdityaLab/MM4TSA
A professional list of Multi-Modalities For Time Series Analysis (MM4TSA) papers and resources.
540
JDAI-CV/CM-NAS
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification (ICCV2021)
Language: Python 48 1 413
chenjingong/DN-ReID
[CVPR2024] Day-Night Cross-domain Vehicle Re-identification
Language: Python 44 2 71
M-3LAB/awesome-multimodal-brain-image-systhesis
40 1 06
workingcoder/MCJA
A New Strong and Simple Baseline Method for VI-ReID (Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification)
Language: Python 40 1 20
ZYK100/MMN
PyTorch code for "Towards a Unified Middle Modality Learning for Visible-Infrared Person Re-Identification"
Language: Python 39 1 55
GuiyuZhao/VRHCF
[ICME 2024] VRHCF: Cross-Source Point Cloud Registration via Voxel Representation and Hierarchical Correspondence Filtering
Language: Python 30 2 64
catalina17/VideoNavQA
An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)
Language: Python 25 3 31
zjzsliyang/CrossLeak
Code for the WWW'20 paper "Nowhere to Hide: Cross-modal Identity Leakage between Biometrics and Devices"
Language: Python 23 3 05
heitorrapela/HalluciDet
[WACV2024] HalluciDet: Hallucinating RGB Modality for Person Detection Through Privileged Information (Accepted at WACV 2024 and LatinX@CVPR2024 Extended Abstract)
Language: Python 22 1 40
JacobYuan7/OCN-HOI-Benchmark
[AAAI 2022] Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics.
Language: Python 19 2 42
Mithunjha/EarEEG_KnowledgeDistillation
Official implementation of "A Knowledge Distillation Framework for Enhancing Ear-EEG based Sleep Staging with Scalp-EEG Data"
Language: Jupyter Notebook 16 2 25
Da1yuqin/TCDiff
Official code for our AAAI25 oral👑 paper Harmonious Group Choreography with Trajectory-Controllable Diffusion — hope you enjoy exploring it! 😊
Language: Python 14 2 22
MIS-DevWorks/FBR
This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning," presented at CVPR 2024.
Language: Python 11 1 21
PAGF188/RAXO
[ICCV 2025] Superpowering Open-Vocabulary Object Detectors for X-ray Vision
Language: Python 9
GWxuan/CL-Gait
[ECCV 2024] Camera-LiDAR Cross-modality Gait Recognition
7 3 10
Atmegal/Comprehensive-Distance-Preserving-Autoencoders-for-Cross-Modal-Retrieval
The code of Comprehensive Distance-Preserving Autoencoders for Cross-Modal Retrieval
Language: Python 6 0 00
llcing/Cross-modal-hashing-SRLCH
Code for our work SRLCH
Language: MATLAB 6 1 03
mkang315/MCTSeg
[Preprint] Official implementation of "A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities".
Language: Python 6 2 11
BEAM-Labs/CrossBind
Official PyTorch implementation of CrossBind: Collaborative Cross-Modal Identification of Protein Nucleic-Acid-Binding Residues.
Language: Python 5 0 10
w1018979952/Audio-Visual-Matching
Voice Face Association Learning Paper List
1
