VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Primary language: Python
License: Apache License 2.0 (Apache-2.0)