/VILA

VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)

Primary language: Python
License: Apache License 2.0 (Apache-2.0)
