Question about the settings for tokenizer and frames_ops of data_preprocess in configs/sample_config.yaml
Xuchen-Li opened this issue · 2 comments
Hello, sorry for bothering you again.
```yaml
data_preprocess:
  with_visual: True
  frames_key: frames
  sample_method: random_clip
  label_key: "vqa"
  task_type: vqa
  tokenizer: "item"
  max_seq_len: 512
  max_prompt_len: 256
  vqa_processor_params:
    box_format: ours_v1
  online_vqa_processor_params:
    task: SOT
    num_segments: 1
  verbose: True
  training: False
  frames_ops:
    Resize:
      size: [336, 336]
    ToTensor: {}
    Normalize:
      mean: [0.48145466, 0.4578275, 0.40821073]
      std: [0.26862954, 0.26130258, 0.27577711]
```
I am wondering how the `tokenizer` and `frames_ops` settings in configs/sample_config.yaml are consumed by
```python
if self.with_visual:
    if isinstance(frames_ops, str):
        self.video_processor = AutoImageProcessor.from_pretrained(frames_ops)
    else:
        self.video_processor = VisionProcessor(frames_ops)
```
and
```python
local_path = tokenizer
self.tokenizer = AutoTokenizer.from_pretrained(
    local_path, use_fast=False, trust_remote_code=trust_remote_code
)
```
in eval/data/video_llm_data.py, lines 98–103 and 123–128.
How should I set configs/sample_config.yaml so that the `video_processor` and `tokenizer` are loaded from a pretrained model?
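To restate the branching I am asking about, here is a minimal, self-contained stand-in (the real `AutoImageProcessor` and `VisionProcessor` calls are stubbed out, since I only care about which branch a given `frames_ops` value takes):

```python
def resolve_frames_ops(frames_ops):
    """Mirror the type dispatch in video_llm_data.py: a string is treated as a
    pretrained-processor path, anything else as a mapping of transform ops."""
    if isinstance(frames_ops, str):
        # the real code calls AutoImageProcessor.from_pretrained(frames_ops)
        return ("pretrained", frames_ops)
    # the real code calls VisionProcessor(frames_ops)
    return ("custom", sorted(frames_ops))

# A string selects the HuggingFace-processor branch.
print(resolve_frames_ops("/path/to/clip-vit"))
# A mapping like the one in sample_config.yaml selects the custom branch.
print(resolve_frames_ops({"Resize": {"size": [336, 336]}, "ToTensor": {}}))
```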
Thanks a lot!
Thanks for your attention!
The `tokenizer` path corresponds to the path of the LLM, such as Llama2's path. There is no need to modify `frames_ops` in config.yaml unless you want to use a different processor, such as CLIP's official processor. If you prefer to use CLIP-ViT's or SigLIP's official processor, simply set `frames_ops` to {path/to/clipvit} or {path/to/siglip}.
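For example (the paths below are placeholders, not real checkpoints), pointing both keys at pretrained models would look like:

```yaml
data_preprocess:
  # Local LLM checkpoint directory (e.g. a Llama2 download);
  # this value is passed to AutoTokenizer.from_pretrained.
  tokenizer: /path/to/llama2
  # A string here is passed to AutoImageProcessor.from_pretrained,
  # e.g. a local CLIP-ViT or SigLIP checkpoint. Keeping the mapping form
  # (Resize/ToTensor/Normalize) uses the built-in VisionProcessor instead.
  frames_ops: /path/to/clip-vit
```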
Thanks a lot!