video-language-model

There are 4 repositories under the video-language-model topic.

  • Coobiw/MPP-LLaVA

    Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-style MLLM on an RTX 3090/4090 with 24 GB of VRAM.

    Language: Jupyter Notebook
  • TencentARC/ST-LLM

    [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"

    Language: Python
  • patrick-tssn/VideoHallucer

    VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

    Language: Python
  • moucheng2017/SOP-LVM-ICL-Ensemble

    [NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Understanding

    Language: Python