Pinned Repositories
gan_dcic
palchenli.github.io
PTVD
short_ytb
VL-Instruction-Tuning
WebCam-LLaVA
PTVD.github.io
VL-Instruct
Code for vision-language instruction tuning. Currently supports BLIP2-t5 and BLIP2-vicuna.
TagGPT
TagGPT: Large Language Models are Zero-shot Multimodal Taggers
mllm-npu
mllm-npu: training multimodal large language models on Ascend NPUs