sshh12/multi_token
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Python · Apache-2.0
Issues
- 1
Error when running AudioWhisper inference
#26 opened by setianke - 0
GGUF Support?
#27 opened by yukiarimo - 2
M1 support?
#23 opened by yukiarimo - 1
OpenAI Client for Serving
#25 opened by SulRash - 1
Fine tuning LLAVA for object detection
#24 opened by dipikakhullar - 6
Cannot compile adapter_model.bin?
#22 opened by kuki2008 - 0
No module named 'imagebind'
#21 opened by kuki2008 - 6
Training with no pretrained encoder - just projection from ready embeddings
#20 opened by tehila17-meet - 3
How train mixtral MoE ?
#18 opened by tommarques56 - 1
Multi GPU
#19 opened by tehila17-meet - 2
is the training data available?
#17 opened by tanganke - 1
Supported Base Models
#16 opened by DhruvSinghiitmandi - 1
Summarize video
#15 opened by linchen111 - 4
pretrain errors
#14 opened by linchen111 - 9
Adapter weights not found
#12 opened by DeuceOfClubs - 0
HFValidationError
#13 opened by linchen111 - 3
- 4
Thanks for the great work
#6 opened by codybum - 3
Thank you for posting this!
#10 opened by matbee-eth - 3
- 1
wait, what is that in my training?
#8 opened by guilh00009 - 7
theres nothing in my output
#7 opened by guilh00009 - 2
Finetuning already trained model
#5 opened by Aniketto16 - 3
Multiple Image QA Model
#4 opened by tsdocode