language-vision

There are 5 repositories under the language-vision topic.

unum-cloud/uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Language: Python 1k 15 2961
JacobYuan7/RLIPv2
[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
Language: Python 112 2 223
Fsoft-AIC/Language-Conditioned-Affordance-Pose-Detection-in-3D-Point-Clouds
[ICRA 2024] Language-Conditioned Affordance-Pose Detection in 3D Point Clouds
Language: Python 17 2 32
CharlesYang030/MTA
MTA: A Lightweight Multilingual Text Alignment Model for Cross-language Visual Word Sense Disambiguation
Language: Jupyter Notebook 1 1 10
ElDokmak/MultiModal-Models
Hands on some MultiModal Models
Language: Jupyter Notebook 0 1 00