EVA Series: Visual Representation Fantasies from BAAI
Code for downloading videos from the HowTo100M dataset
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.