LH019

Pinned Repositories

LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 20.7k 159 1.6k 2.3k
MetaTransformer
Meta-Transformer for Unified Multimodal Learning
Language: Python 1.5k 22 68115
ChatBridge
ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without relying on all combinations of paired data.
Language: Python 48 2 71
DARNet
00
Multi-Agents
0 1 00
student_mis
Language: Vue 00
testGit
Trying out git
00
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook 10.1k 97 674977
MMSA-FET
A tool for extracting multimodal features from videos.
Language: Python 145 6 4521

LH019's Repositories