Can we do the only text, image and text and video and text finetuning with lora in a one run
thisurawz1 opened this issue · 4 comments
Can we do only text, image-text, and video-text finetuning with Lora in one run? I mean, put only text, text-image, and video-image samples in the same custom.json file and do the fine-tuning?
Yes, you can. The LazysupervisedDataset
in train.py
unifies the processing of pure text
, image-text
, video-text
data sample.
Can we do only text, image-text, and video-text finetuning with Lora in one run? I mean, put only text, text-image, and video-image samples in the same custom.json file and do the fine-tuning?
@thisurawz1 , can you share a sample format JSON which has only text, text-image, and video-image which you are using for finetuning
Can we do only text, image-text, and video-text finetuning with Lora in one run? I mean, put only text, text-image, and video-image samples in the same custom.json file and do the fine-tuning?
@thisurawz1 , can you share a sample format JSON which has only text, text-image, and video-image which you are using for finetuning
image-text - id 0
image-video - id 1
text only - id 3
[
{
"id": 0,
"image": "images/xxx.jpg",
"conversations": [
{
"from": "human",
"value": "<image>\nWhat are the colors of the bus in the image?"
},
{
"from": "gpt",
"value": "The bus in the image is white and red."
}
]
},
{
"id": 1,
"video": "videos/xxx.mp4",
"conversations": [
{
"from": "human",
"value": "<video>\nWhat are the main activities that take place in the video?"
},
{
"from": "gpt",
"value": "The main activities that take place in the video are the preparation of camera equipment by a man, a group of men riding a helicopter, and a man sailing a boat through the water."
}
],
},
{
"id": 3,
"model": "",
"conversations": [
{
"from": "human",
"value": "What are the colors of the bus in the image?"
},
{
"from": "gpt",
"value": "The bus in the image is white and red."
}
]
}
]
folder structure
VideoLLaMA2
├── datasets
│ ├── custom_sft
│ | ├── images
│ | ├── videos
| | └── custom.json
Thanks @thisurawz1