PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Python · Apache-2.0
Issues
missing file: preprocessor_config.json
#186 opened by JunanPan - 1
Multi-GPU inference problem.
#179 opened by jiazheng-xing - 0
Hardware Requirement for the model to run in LORA
#194 opened by leochang123 - 1
Videochatgpt tuning data encounters some error
#193 opened by Lexarymade - 0
Can you Fix the DEMO. Demo is no longer working
#192 opened by thisurawz1 - 1
Pretrain and Finetune template versions
#189 opened by xin-li-67 - 4
ImportError: cannot import name '_expand_mask' from 'transformers.models.clip.modeling_clip'
#184 opened by qiuchen001 - 1
Unable to install flash attn module
#143 opened by anantalp - 0
Can't reproduce results on MSRVTT and MSVD dataset
#191 opened by 1999Lyd - 3
Size mismatch error when running locally.
#152 opened by ssuncheol - 0
Issues with Converting the video-llava Model to ONNX
#190 opened by Ark1a - 5
Error during training: AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
#144 opened by Qinger27 - 0
When I evaluated the ‘TGIF_Zero_Shot_QA’ dataset, the accuracy was only 13%. Should I train first to achieve the 70% accuracy in the paper?
#188 opened by FanshuoZeng - 0
Can this model apply a few-shot when inference?
#185 opened by Ijustakid - 0
Valley video not found during pretraining.
#182 opened by Aakriti05 - 0
Questions about LanguageBind Usage
#180 opened by lingjunzhao - 2
Issues with finetune_lora.sh
#171 opened by shag1802 - 0
Request for Inference Parameters on VideoLLava
#178 opened by adrianwestmoon - 0
Is it possible to train with languages other than English, and are the 8 frames sampled uniformly across different video lengths?
#177 opened by YoungjaeDev - 3
error:RuntimeError: Error(s) in loading state_dict for CLIPVisionModel: size mismatch for vision_model.embeddings.class_embedding: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
#175 opened by zapqqqwe - 0
size mismatch
#176 opened by cs19469 - 0
total_frames zero error
#174 opened by OliverLeeXZ - 0
pretrained checkpoint
#173 opened by OliverLeeXZ - 0
How to increase the sample frames amount
#172 opened by sherlock666 - 1
Video-LLava Upgradation
#164 opened by Tortoise17 - 2
Error with Gradio Client: Please Upgrade Gradio to 4.x and Redeploying HuggingFace Space
#170 opened by zhanwenchen - 0
Question Regarding Video Frame Processing
#169 opened by Kkkaystone - 0
extremely slow with transformers
#167 opened by RaulKite - 0
Training help
#166 opened by felmoreno1726 - 0
About class embedding
#165 opened by feiyu12138 - 1
Inference model path unclear
#161 opened by Ali2500 - 0
Please specify library versions
#159 opened by nahidalam - 1
Uri validation issue on Replicate
#157 opened by Gab1988 - 0
The problem about the environment
#154 opened by swiftCC - 0
Some weights of the model checkpoint at "./Video-LLaVA-7B" were not used when initializing LlavaLlamaForCausalLM:
#153 opened by ssuncheol - 0
how to load pretrained weight on local (offline)?
#150 opened by jusepv - 0
Warnings about weights, temperature, top_p, and embedding layer, but it still works. Should I worry about them?
#149 opened by secretlycarl - 0
Impossible to install on windows
#148 opened by secretlycarl - 1
Error when running inference on multiple images: IndexError: list index out of range
#146 opened by Qinger27 - 0
Why is the loss always 0?
#141 opened by xienan0326 - 0
About contrastive learning
#140 opened by mjkmain