FreedomIntelligence/LongLLaVA
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
Python
Issues
- 1
Consider evaluating on LongVideoBench
#15 opened by teowu - 4
LongLLava-Med
#13 opened by LongYu-LY - 1
CT frame numbers and frame resolution
#14 opened by ruian1 - 2
repository not found, token with permission
#11 opened by adelightday - 2
additional auxiliary loss for moe?
#12 opened by maxin-cn - 1
device spec to run the inference
#8 opened by adelightday - 2
How to switch model from 13b to 9b
#10 opened by xuzukang - 2
the role of moe
#7 opened by maxin-cn - 1
Running pip install on requirement.txt produces many errors, which makes the code hard to run. Can you provide more details?
#4 opened by Messi2013 - 1
Architecture of LongLLaVA
#6 opened by maxin-cn - 1