joez17

PhD Student from CASIA

Beijing

Pinned Repositories

Compressed-Video-Reader
A video reader for extracting motion vectors and residuals from encoded H.264 videos.
Language: C | 21 1 73
alpaca-lora
Instruct-tune LLaMA on consumer hardware
Language: Jupyter Notebook | 0 0 00
ChatBridge
ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without relying on all combinations of paired data.
Language: Python | 53 2 81
ChatSearch
ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
6 2 00
LLaVA-NeXT-kv
Language: Python | 00
VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
Language: Python | 46 1 50
Kangaroo
Official implementation of Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input
Language: Python | 67 3 80
DyCoke
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
Language: Python | 40 2 50
LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
Language: Python | 317 9 4416
