dvlab-research/LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Python · Apache-2.0
Issues
About ZERO3
#75 opened by xxtars - 0
training loss in stage-1
#88 opened by Nastu-Ho - 0
code details
#87 opened by Nastu-Ho - 0
Extract context relevancy
#86 opened by IgnacioSan22 - 0
KeyError: 'LlavaConfig'
#85 opened by skyol99 - 0
About the WebVid dataset
#83 opened by szbcasia - 1
why not use LoRA for tuning Vicuna?
#72 opened by dragen1860 - 0
Confusion in pre-process images for long video
#77 opened by zhuqiangLu - 0
Zero-3 offload support
#60 opened by XenonLamb - 1
About the json in stage2 and stage3
#79 opened by liziming5353 - 0
Questions about Text Decoder and Text Query
#80 opened by SeuXiao - 0
about the context length for long video
#78 opened by zhuqiangLu - 1
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument weight in method wrapper_CUDA__native_layer_norm)
#76 opened by daocodedao - 1
Multi-image inference
#71 opened by g-h-chen - 1
error: llava key
#64 opened by menahem-borges-rodrigues - 2
Sharing training loss
#59 opened by Deaddawn - 1
Computation costs for each stage?
#70 opened by Becomebright - 2
How to change default path for model_zoo
#67 opened by sykuann - 1
Questions about the subtitles.
#66 opened by Yxxxb - 2
Long Video dataset
#61 opened by eslambakr - 2
flash-attn
#65 opened by ismailukman - 1
About evaluation on vqav2 dataset
#63 opened by liziming5353 - 5
Incomplete evaluation on MSVD-QA dataset.
#52 opened by XenonLamb - 3
About text encoder
#51 opened by liziming5353 - 3
MSVD ACC decrease after stage3
#58 opened by Deaddawn - 1
Custom long videos fail to run at all
#54 opened by TotoroDHL - 3
is eva_vit_g.pth trained by yourself?
#56 opened by Deaddawn - 1
why do stage 1 and 2 use different parameters: `--version plain_guided` vs. `--version imgsp_v1`?
#55 opened by dragen1860 - 2
Enquiry on Download Permission
#53 opened by HenryHZY - 2
A question in stage3
#45 opened by liziming5353 - 3
two types of tokenizer?
#43 opened by dragen1860 - 1
multiple json for training?
#39 opened by dragen1860 - 4
Long Video CLI wrong
#48 opened by QiSu77 - 1
is the LLM weight trainable during stage1-2-3?
#49 opened by dragen1860 - 1
stage 2: freezing the visual encoder?
#44 opened by dragen1860 - 1
why is `build_vision_tower` called twice?
#42 opened by dragen1860 - 1
what does `lazy_preprocess` mean?
#41 opened by dragen1860