cross-modal-learning
There are 21 repositories under the cross-modal-learning topic.
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
MohamedAfham/CrossPoint
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)
whwu95/Text4Vis
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
whwu95/BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
choyingw/Cross-Modal-Perceptionist
CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Toytiny/CMFlow
[CVPR 2023 Highlight 💡] Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
RunpeiDong/ACT
[ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
mako443/Text2Pos-CVPR2022
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
knightyxp/DGL
[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.
frank-chris/ImageTextRetrieval
In this work, we implement several cross-modal learning schemes, including a Siamese Network, a Correlational Network, and a Deep Cross-Modal Projection Learning model, and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor, and we evaluate it on image-text retrieval with a fashion clothing dataset.
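As a rough illustration of the kind of objective such retrieval schemes optimize (this is a generic sketch, not the repository's code), the snippet below projects pre-extracted image and text features into a shared space and trains them with a symmetric contrastive loss; all dimensions and module choices are assumed placeholders.

```python
# Illustrative sketch only: a generic two-tower image-text embedding model
# trained with a symmetric contrastive loss. Feature dimensions and layer
# sizes are assumptions, not values from the repository above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoTowerRetrieval(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256):
        super().__init__()
        # Project pre-extracted image and text features into a shared space.
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def forward(self, img_feats, txt_feats):
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        # Pairwise cosine similarities between all images and texts in the batch.
        return self.logit_scale.exp() * img @ txt.t()

def symmetric_contrastive_loss(logits):
    # Matching image-text pairs lie on the diagonal of the similarity matrix.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)

# Usage with random stand-in features for a batch of 8 image-caption pairs.
model = TwoTowerRetrieval()
logits = model(torch.randn(8, 2048), torch.randn(8, 768))
loss = symmetric_contrastive_loss(logits)
loss.backward()
```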
Markin-Wang/CAMANet
[IJBHI 2023] Official implementation of "CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation", accepted to the IEEE Journal of Biomedical and Health Informatics (J-BHI), 2023.
verlab/StraightToThePoint_CVPR_2020
Original PyTorch implementation of the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data", presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
IGITUGraz/MemoryDependentComputation
Code for Limbacher, T., Özdenizci, O., & Legenstein, R. (2022). Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity. arXiv preprint arXiv:2205.11276.
codiceSpaghetti/T4SA-2.0
This project creates the T4SA 2.0 dataset, a large dataset for training visual Sentiment Analysis models in the Twitter domain using a cross-modal student-teacher approach.
PrithivirajDamodaran/WhatTheFood
An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.
GaochangWu/FMF-Benchmark
This is a cross-modal benchmark for industrial anomaly detection.
kjanjua26/Do_Cross_Modal_Systems_Leverage_Semantic_Relationships
This is the code for our ICCV'19 paper on cross-modal learning and retrieval.
basiclab/TrajPrompt
[ECCV 2024] Official Implementation of "TrajPrompt: Aligning Color Trajectory with Vision-Language Representations"
TataMoktari/CrossModal_GAN
We design a cross-modal GAN that learns image-to-image modality transformation across domains. The network synthesizes infrared images from visible images on the VEDAI dataset.
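For readers unfamiliar with this kind of visible-to-infrared translation, the sketch below shows a minimal pix2pix-style conditional GAN combining an adversarial loss with an L1 reconstruction term. It is an illustration under assumed placeholder architectures and hyper-parameters, not the repository's implementation or the settings used for VEDAI.

```python
# Illustrative sketch only: a tiny conditional GAN for visible-to-infrared
# image translation. All architectures, sizes, and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGenerator(nn.Module):
    """Maps a 3-channel visible image to a 1-channel infrared image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """Patch discriminator over the concatenated (visible, infrared) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )
    def forward(self, visible, infrared):
        return self.net(torch.cat([visible, infrared], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

visible = torch.randn(4, 3, 64, 64)   # stand-in visible batch
infrared = torch.randn(4, 1, 64, 64)  # stand-in paired infrared batch

# Discriminator step: real pairs -> 1, generated pairs -> 0.
fake_ir = G(visible).detach()
real_logits = D(visible, infrared)
fake_logits = D(visible, fake_ir)
d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
          + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator and stay close to the paired IR image.
fake_ir = G(visible)
gen_logits = D(visible, fake_ir)
g_adv = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))
g_loss = g_adv + 100.0 * F.l1_loss(fake_ir, infrared)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```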