/mllm-dpo

[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model

Primary Language: Jupyter Notebook
