DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Python · BSD-3-Clause
Issues
The config files are stored locally, but I still get OSError: Can't load tokenizer for 'bert-base-uncased'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-uncased' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.
#172 opened by Asmallsoldier - 2
Hugging Face demo runtime error
#143 opened by sihoseanhan - 1
modelling_llama.py
#166 opened by zeroQiaoba - 1
Issue in api endpoints
#170 opened by RAJA102002 - 0
Evaluation on large-scale dataset
#151 opened by hritam-98 - 0
Audio input
#168 opened by CHEN-H01 - 0
Is the LLM kept frozen in both stages?
#160 opened by Nastu-Ho - 0
Error loading the audio
#163 opened by xjr01 - 0
Finetune with LoRA and QLoRA
#162 opened by thisurawz1 - 0
finetune-billa7b-zh inference error shape '[-1, 136]' is invalid for input of size 137
#161 opened by len2618187 - 0
What if no frame_position_embeddings?
#158 opened by LetsGoFir - 2
Unable to launch demo
#149 opened by joysl - 2
how to increase the numbers of input frame?
#155 opened by onlyonewater - 1
.
#153 opened by advenTure423 - 0
Possible bugs in LR scheduler
#154 opened by SAGNIKMJR - 0
Compatibility b/w torch and torchvision?
#152 opened by shreyakannan1205 - 0
Is video-LLaMA capable of comprehending videos that have faces surrounded by bounding boxes(face recognition)
#150 opened by PhilipAmadasun - 1
Multiple Video-Text pair Support
#129 opened by mustafaadogan - 1
Frame-aware?
#142 opened by jayavanth - 0
How to improve fine-tuning performance on downstream tasks
#147 opened by Jinjikiko - 2
A demo without gradio
#140 opened by liboliba - 0
multi-cards training
#141 opened by gqsmmz - 2
Question about the environment.yml file
#120 opened by balabanahei - 0
example model deployment
#139 opened by nahidalam - 1
inf value occurs during forwarding process when fine-tuning VL branch with LLAVA-150K+MiniGPT4-3.5K+webvid-instruct
#138 opened by xuboshen - 0
Dear author, How much time does it cost to train this model? With what type of GPU cards?
#136 opened by zhangyuereal - 0
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM: size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]). size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
#135 opened by Amber0913 - 1
Very poor audio understanding
#134 opened by DumplingLife - 0
How to finetune video-llama using deepspeed?
#133 opened by tangyipeng100 - 0
Prompt
#132 opened by tobyperrett - 1
Hugging Face Spaces not working!
#131 opened by simmimak - 2
change the frames and query_tokens size
#128 opened by AllenFind - 1
Interesting prompt template
#126 opened by tian1327 - 1
Gradio does not work, stuck on uploading forever.
#127 opened by whoishoa - 0
Do you have any plans to open-source the pre-training and fine-tuning checkpoints based on Llama 2 Chinese version?
#125 opened by bjcodereview3 - 0
Error fetching data from the Dataloader during training
#124 opened by Junphy-Jan - 1
how to run using LLaMA-2-chat?
#119 opened by tarunmis - 1
Could you update the README?
#122 opened by Junphy-Jan - 0
How to deploy a Video-LLaMA model trained with LLaMA-2?
#121 opened by DimplesL - 0
Is this the correct way to set up val for training?
#118 opened by zhaozhipeng1997