dhansmair/flamingo-mini
Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training
Python · MIT
Issues
- 3
Doubt about MaskedCrossAttention
#18 opened by eileenforwhat - 11
Doubts about training
#17 opened by TheMrguiller - 1
About the training script
#16 opened by kenchan0226 - 2
Text Prompting the model using the cached image
#14 opened by tutunarsl - 2
few shot example
#12 opened by shiv6891 - 4
- 3
Bugs of evaluation / caption generation
#13 opened by evelinehong - 2
Error on image_captioning.ipynb
#9 opened by tomasmadeira - 2
ImportError: cannot import name 'CLIPImageProcessor' from 'transformers' (/databricks/python/lib/python3.9/site-packages/transformers/__init__.py)
#10 opened by fionathrill - 0
- 4
Prompting example
#5 opened by samp830 - 0
Video Support
#8 opened by dhansmair - 0
- 0
generate() only works with use_cache=True
#6 opened by dhansmair - 0
- 0
parameters_trainable() and state_dict_trainable() do not include the token embedding matrix
#2 opened by dhansmair - 4
Web scraped data pretraining
#1 opened by edmondja