Pinned Repositories
long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
DockerTarBuilder
A workflow for quickly building Docker images for a specified architecture/platform.
q_former
vicana-13b-pth
LLaVA-NeXT
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM