gyxxyg/VTG-LLM

Problem when running inference directly

DQYZHWK opened this issue · 10 comments

bash eval.sh
The launch script is as follows:

#!/bin/bash

DIR="VTG-LLM"
MODEL_DIR="/home1/lw/fyy/VTG-LLM/vtgllm.pth"


# TASK='dvc'
# ANNO_DIR='data/VTG-IT/dense_video_caption/Youcook2'
# VIDEO_DIR='data/youcook2/YouCook2_asr_denseCap/youcook2_6fps_224'
# DATASET='youcook'
# SPLIT='val'
# PROMPT_FILE="prompts/${TASK}.txt"
# GT_FILE="${ANNO_DIR}/${SPLIT}.caption_coco_format.json"


TASK='tvg'
ANNO_DIR="/home3/linwang/fyy/TimeChat/data/ours_annotation"
VIDEO_DIR="/home4/jiaxin/linwang/fyy/video/subset/"
DATASET='ours'
SPLIT='test'
PROMPT_FILE="prompts/mr.txt"
GT_FILE="${ANNO_DIR}/${SPLIT}.caption_coco_format.json"

# TASK='vhd'
# ANNO_DIR='data/VTG-IT/video_highlight_detection/QVHighlights'
# VIDEO_DIR='data/qvhighlights/videos/val'
# DATASET='qvhighlights'
# SPLIT='val'
# PROMPT_FILE="prompts/vhd.txt"
# GT_FILE="${ANNO_DIR}/highlight_${SPLIT}_release.jsonl"

NUM_FRAME=96
OUTPUT_DIR='output'
CFG_PATH=""


CUDA_VISIBLE_DEVICES=2 python evaluate.py --anno_path ${ANNO_DIR} --video_path ${VIDEO_DIR} --gpu_id 0 \
--task ${TASK} --dataset ${DATASET} --output_dir ${OUTPUT_DIR} --split ${SPLIT} --num_frames ${NUM_FRAME} --batch_size 1 \
--prompt_file ${PROMPT_FILE} --vtgllm_model_path ${MODEL_DIR} --cfg_path eval_configs/videollama-slot-96.yaml

cd metrics/${TASK}
python eval_${TASK}.py --pred_file "output/ours_predicate.json" --gt_file ${GT_FILE} | tee "output/ours_predicate.txt"
cd ../..

The following error occurs:
image

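I did not do any fine-tuning; I used the checkpoint directly for inference. Is the provided vtgllm.pth only usable for fine-tuning, or did I pass the wrong Hugging Face weights? Any help would be appreciated.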

image
My CUDA driver is 11.4 and the device is an A100. I still cannot load the vtgllm weight file.

Could you share the torch version you are using?
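For anyone answering this question, a minimal sketch of how the relevant versions could be collected is shown below. The helper name `describe_env` is hypothetical and not part of VTG-LLM; it only reads `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and falls back to `None` when torch is not installed.

```python
import importlib
import platform


def describe_env() -> dict:
    """Collect the Python and torch details that are useful in a bug report.

    Hypothetical helper for illustration; it is not part of VTG-LLM.
    """
    info = {
        "python": platform.python_version(),
        "torch": None,             # e.g. "2.2.1+cu118"
        "torch_cuda_build": None,  # CUDA version the wheel was built against
        "cuda_available": None,    # whether torch can see a usable GPU
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment; report what we have.
        return info

    info["torch"] = torch.__version__
    info["torch_cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


# Print one "key: value" line per field so it can be pasted into an issue.
for key, value in describe_env().items():
    print(f"{key}: {value}")
```

Running `nvidia-smi` alongside this gives the driver version, which is the other half of the compatibility question.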

After switching to torch==2.2.0+cu121, loading a simple model now succeeds. However, the driver is probably still 11.4 and needs to be updated.

Traceback (most recent call last):
File "/home3/linwang/fyy/VTG-LLM/evaluate.py", line 434, in <module>
main(args)
File "/home3/linwang/fyy/VTG-LLM/evaluate.py", line 245, in main
model = model_cls.from_config(model_config).to('cuda:{}'.format(args.gpu_id))
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1152, in to
return self._apply(convert)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 825, in _apply
param_applied = fn(param)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1150, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "/home/jiaxin/miniconda3/envs/FastSAM/lib/python3.9/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

Fixed! I tried torch==2.2.1+cu118 while keeping the CUDA driver at 11.4, and that solved the problem.
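A rough sketch of why the cu118 wheel works where cu121 did not, assuming the usual CUDA rule that a build runs on a driver of the same major version (minor-version compatibility), while a CUDA 12 build needs a CUDA 12 capable driver. The helper names are hypothetical, and the check is a simplification rather than an official compatibility test. The number 11040 in the error message encodes CUDA 11.4 as 1000 * major + 10 * minor.

```python
import re
from typing import Optional, Tuple


def decode_driver_cuda(code: int) -> Tuple[int, int]:
    """Decode the integer from the PyTorch error (e.g. 11040) into (major, minor)."""
    return code // 1000, (code % 1000) // 10


def wheel_cuda(torch_version: str) -> Optional[Tuple[int, int]]:
    """Extract the CUDA build from a version string such as '2.2.1+cu118'.

    Returns None for versions without a '+cuXYZ' suffix (for example CPU wheels).
    """
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def likely_compatible(torch_version: str, driver_code: int) -> bool:
    """Simplified check: the wheel's CUDA major must not exceed the driver's.

    Assumption: same-major builds run via CUDA minor-version compatibility.
    Real requirements also depend on the exact driver release.
    """
    build = wheel_cuda(torch_version)
    if build is None:
        return True  # no CUDA build tag, so nothing to compare against
    driver_major, _ = decode_driver_cuda(driver_code)
    return build[0] <= driver_major


# The two combinations tried in this thread, against driver code 11040.
for version in ("2.2.0+cu121", "2.2.1+cu118"):
    print(version, likely_compatible(version, 11040))
```

Under these assumptions the cu121 build is rejected and the cu118 build is accepted, which matches what was observed above.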

I ran into this bug too. How can I fix it?

Please consider upgrading to a newer version of PyTorch, such as torch==2.1.2+cu121. Additionally, please ensure that the CUDA version is compatible with your device.

Your checkpoint is not universal enough. Also, when I used your requirements-v100.txt file to configure the environment, it caused a lot of conflicts.

The requirements-v100.txt file is directly exported from our CUDA environments. You may want to try running bash install_requirements-v100.sh, as this typically works in most cases. Also, torch==2.1.2+cu121 is recommended.

P.S. This project is built upon VideoLlama and TimeChat, which utilize older training frameworks. We also find it annoying, and we are currently working on training better models using an improved framework. Stay tuned for updates.

OK. I have run the code and got the URL for the web server, but the model's answers are irrelevant to the questions. Can you provide some Q&A examples or demonstrations?