/TextHawk

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Primary LanguagePython

Watchers