How to generate output images
kl2004 opened this issue · 4 comments
Thank you for releasing the inference code and model weights! 🚀
I have been experimenting with the script miniviewer.py and noticed that it doesn't generate output images, even though there is an option group called "Image Decoder Options". It was mentioned somewhere that image output is disabled by default. I am curious whether the released models were trained for image generation and, if so, how to enable image output.
https://ai.meta.com/blog/meta-fair-research-new-releases/
The models we’re releasing today were safety tuned and support mixed-modal inputs and text-only output to be used for research purposes. While we’ve taken steps to develop these models responsibly, we recognize that risks remain. At this time, we are not releasing the Chameleon image generation model.
Unfortunately, the public release cannot generate images in its output.
Can you clarify how this restriction is implemented? Is the released model technically capable of producing image tokens but trained not to, or is a piece of the architecture required to produce them missing?
To answer my own question, an independent lab has found a method to fine-tune the pre-trained Chameleon model for image generation: https://github.com/GAIR-NLP/anole
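For anyone who wants to check the first part of the question themselves, one way is to probe whether the released checkpoint ever places probability mass on image tokens. Below is a minimal sketch, assuming the Hugging Face port of Chameleon (`ChameleonForConditionalGeneration` / `ChameleonProcessor`, gated model id `facebook/chameleon-7b`); the `IMAGE_TOKEN_IDS` range is a placeholder assumption that you would need to replace with the actual image-token id range from the tokenizer/config.

```python
# Minimal probe: does the released checkpoint assign any next-token
# probability to image tokens? A sketch only, not official code.
import torch
from transformers import ChameleonForConditionalGeneration, ChameleonProcessor

model_id = "facebook/chameleon-7b"  # gated HF checkpoint of the release
processor = ChameleonProcessor.from_pretrained(model_id)
model = ChameleonForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Draw a picture of a red apple."
inputs = processor(text=prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # last position
probs = next_token_logits.softmax(dim=-1)

# ASSUMPTION: image tokens occupy one contiguous id block. The bounds
# below are placeholders -- read the real range off the tokenizer/config.
IMAGE_TOKEN_IDS = torch.arange(4, 4 + 8192)
print(f"mass on image tokens: {probs[IMAGE_TOKEN_IDS].sum().item():.3e}")
```

If the mass stays essentially zero regardless of the prompt, the restriction lives (at least partly) in the weights themselves, which would be consistent with Anole's approach of fine-tuning to re-enable image output.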
Guys, really, after a day of fighting with dependency installation, and after the license agreement and everything, the model cannot produce images! A huge waste of time.