AIDC-AI/Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python · Apache-2.0

Issues

Grounding Ability
#43 opened 14 days ago by zizaisuiyuan
0
How to correctly input image-free prompt?
#37 opened 16 days ago by HaiFangfang
1
Visual Tokenizer: IMAGE_ATOM_ID
#42 opened 17 days ago by KyujinHan
0
Will the training script and data_info_v1.6 be open-sourced?
#23 opened 3 months ago by liuheng0111
3
A few questions about `train.py`
#41 opened 22 days ago by KyujinHan
0
How to create a custom dataset?
#40 opened 25 days ago by himasai9712
0
Why use different datasets for training Ovis1.5-Gemma2-9B-S3 and Ovis1.5-Llama3-8B-S3?
#39 opened a month ago by LIRENDA621
0
Are you planning to release a small version for edge devices like Jetson AGX Orin?
#38 opened a month ago by ALFONSOBUGRA
0
When will the 27B Gemma version model be released?
#34 opened a month ago by ALFONSOBUGRA
2
Any plans to support video and audio?
#35 opened 2 months ago by thesby
1
Ovis1.6-Llama3.2-3B-GPTQ-Int4: how can inference be run on the CPU?
#36 opened 2 months ago by peterlong2003
1
Finetune Ovis 1.6 with LoRA
#27 opened 3 months ago by anhquannguyen21
2
Running on the 4-bit model
#29 opened 3 months ago by haiderasad
2
Can I use this model on the CPU?
#32 opened 2 months ago by peterlong2003
0
How to use multiple GPUs for inference?
#12 opened 4 months ago by waltonfuture
8
ValueError: Model architectures ['Ovis'] are not supported for now.
#33 opened 2 months ago by yangxin60-tal
0
Is it possible to run inference with Ovis 1.6 on a single 4090 GPU?
#22 opened 3 months ago by Raven625
6
Run into 'HybridCache' object has no attribute 'max_batch_size' error when doing inference
#31 opened 3 months ago by dustinjoe
3
Memory usage keeps increasing (memory leak)
#30 opened 3 months ago by JianbangZ
1
Missing datasets
#28 opened 3 months ago by Hutaf
0
Can the model identify and analyze videos? How do I input a video? Do you have any examples?
#25 opened 3 months ago by libai-lab
1
About the equivalence and a slightly more complex MLP connection
#5 opened 5 months ago by lucasjinreal
12
The size of tensor a (3084) must match the size of tensor b (3085) at non-singleton dimension 3
#18 opened 3 months ago by longkeyy
1
training script
#20 opened 3 months ago by ecnuycxie
2
Failed to process batch: Currently, only support `batch_size=1`
#19 opened 3 months ago by Phoenix724
5
Is multi-image input supported? If so, could you provide an example?
#24 opened 3 months ago by yangxin60-tal
1
What does IMAGE_INDICATOR_IDS mean?
#21 opened 3 months ago by SamaelChen
1
About the license
#17 opened 3 months ago by ecnuycxie
2
The training effect is poor using official data and code
#15 opened 3 months ago by liuheng0111
1
How to finetune the model?
#16 opened 3 months ago by cyj95
1
> We have uploaded the images for `ai2d-mc-15k` to [Huggingface](https://huggingface.co/datasets/AIDC-AI/Ovis-dataset/blob/main/images/ai2d-mc-15k.zip).
#11 opened 3 months ago by liuheng0111
1
Cannot find ocr-469k images
#13 opened 4 months ago by luyao-cv
1
Which is the jsonl file for hwl-eng-10k?
#14 opened 4 months ago by luyao-cv
1
About the dataset
#10 opened 4 months ago by lucasjinreal
1
Some datasets cannot be downloaded
#9 opened 4 months ago by liuheng0111
2
Is SoftMax Computation-Intensive?
#8 opened 4 months ago by wusize
1
Is there a complete dataset?
#7 opened 5 months ago by liuheng0111
2
The training strategy in the Ablation Study
#6 opened 5 months ago by ruxin123
1
Cannot access the dataset
#1 opened 5 months ago by liuheng0111
5
Missing *.parquet files in https://huggingface.co/datasets/AIDC-AI/Ovis-dataset/tree/main/meta_files/v1_5
#3 opened 5 months ago by SeuTao
1
Prompts used to build In-house data
#2 opened 5 months ago by zjy-ucas
2
Missing metadata files in v1.5
#4 opened 5 months ago by kakao-logan-c
2