NVlabs/VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python · Apache-2.0

Issues

What is the conv_mode for VILA-1.5-3B ?
#103 opened a month ago by amitbcp
1
Expected Release Date for VILA^2 Model and Code
#124 opened 4 months ago by SZUHvern
1
ValueError: The checkpoint you are trying to load has model type `llava_llama` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
#135 opened a month ago by eternal8080
14
Context size and examples for LongVILA
#141 opened 3 months ago by yulinzou
1
KeyError: 'llava_llama'
#138 opened 3 months ago by RajaAIStarter
1
cannot download dataset
#139 opened a month ago by henrycjh
1
Unable to run Gradio demo: VILA with TinyChat
#146 opened 2 months ago by mitraavi
1
How to get the stage 2 checkpoint path for 3_sft.sh
#143 opened 2 months ago by Qnancy
5
What is the conv_mode for VILA1.5-40b in video inference?
#145 opened 2 months ago by stdKonjac
1
Fine-tuning LongVILA
#140 opened 3 months ago by lyluh
2
Repetitive Output in LongViLa-LLama3-1024Frames
#149 opened a month ago by hb-jw
1
How to run longvila large context, sequence parallel inference?
#130 opened 4 months ago by zadeismael
20
How to change openai inference call to docker for video?
#148 opened a month ago by cholland-nv
0
Docker setup gets error
#147 opened a month ago by cholland-nv
1
How to inference with AWQ in linux shell.
#144 opened a month ago by GitMonkey0
0
Is the 2D Seq parallelism equivalent to the USP in LongVILA?
#116 opened 4 months ago by hijkzzz
1
Dataset and Training code for Longvila
#132 opened 4 months ago by JcWang20
3
TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len' when running VILA model inference
#126 opened 4 months ago by LanceLeonhart
4
Long context video module only
#142 opened 3 months ago by MH-Python
0
VILA-1.5-HD coming soon?
#137 opened 3 months ago by collinmccarthy
1
About sharegpt_video. How do you make video file from jpeg images?
#113 opened 4 months ago by osttkm
2
About the inference on video.
#134 opened 3 months ago by trinhvg
1
create long-video QA samples
#121 opened 4 months ago by peiliu0408
3
Issue with Flash Attention on V100 GPU for Llama-3-VILA1.5-8B Model
#109 opened 5 months ago by vedernikovphoto
8
how to run VILA1.5-40B-AWQ
#125 opened 4 months ago by chenxinhua
1
Trying to run VILA on triton with triton_llm backend
#133 opened 4 months ago by dand-milestone
0
Where is the server.py script?
#131 opened 4 months ago by zixinglin07
0
[HELP] Do we have any docker image for Jetson platform ?
#111 opened 4 months ago by lenoardshannon
2
How to run vila with TinyChatEngine with multiple understanding enabled?
#129 opened 4 months ago by yg1988
0
Fine tuning and --evaluation_strategy argument
#122 opened 4 months ago by lyluh
1
Can VILA do grounding jobs?
#128 opened 4 months ago by PredyDaddy
1
Plz fix run_vila.py line 65 output variable(s)
#127 opened 4 months ago by ziyaosg
0
Data preparation for Stage 4 and Stage 5 in LONGVILA
#119 opened 4 months ago by GenjiB
0
LongVILA - compatibility with other LLMs
#115 opened 4 months ago by orrzohar
1
COYO-700M Dataset Download Script Error
#107 opened 4 months ago by XuGW-Kevin
3
Support for multi-video captioning with multiple grid image inputs?
#96 opened 4 months ago by YoungjaeDev
2
How to convert model to gguf
#93 opened 4 months ago by dand-milestone
3
No training scripts in scripts/v1_5/paper/
#112 opened 4 months ago by yhyang123
1
Whether the visual encoder participates in training
#95 opened 4 months ago by LoverLost
3
Image text retrieval support
#106 opened 4 months ago by lhchau
1
Support VILA with lmdeploy
#105 opened 4 months ago by cmpute
1
[Help] Using VILA1.5-40b model for Video Descriptions
#110 opened 5 months ago by SidPad03
1
Is there any way to increase the context window?
#100 opened 5 months ago by ZackBradshaw
4
Deployment to SageMaker and/or HuggingFace Inference Endpoints Fails With Error
#94 opened 5 months ago by averypfeiffer
5
Question re. LanguageModel vs LanguageModelForCausalLM functionalities
#101 opened 5 months ago by orrzohar
2
AttributeError: 'Image' object has no attribute 'shape'
#104 opened 5 months ago by ZackBradshaw
6
question: what does 'repack_multimodal_data' function do?
#98 opened 5 months ago by orrzohar
1
release schedule for the "VILA1.5-34b-4bit-AWQ" model.
#99 opened 5 months ago by xiexiaoshinick
1
Multi-Image or Multi-Video Inference Example
#97 opened 5 months ago by chancharikmitra
2
Llama2 or Llama3
#102 opened 5 months ago by amitbcp
0