[Colab] Text embedding optimization
ifilipis opened this issue · 0 comments
Input image: https://static8.depositphotos.com/1370441/848/i/600/depositphotos_8486144-stock-photo-beach-and-tropical-sea.jpg
Input text: 'elon musk'
Colab that runs out of memory: https://colab.research.google.com/drive/1ancv6fQMrzaz67Ikvfv3wnjlwpWsoebO?usp=sharing
My method is to optimize the transformer's text embedding so that the output moves closer to the input image. It is similar to fine-tuning, except that the text embeddings are optimized instead of the model weights. I had to modify the model's forward pass so that it retains the gradient. Sorry for the messy code.
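To make the idea concrete, here is a minimal PyTorch sketch of optimizing an input embedding while the model weights stay frozen. It is not the Colab code: the small `model`, the random `target`, the tensor sizes, and the learning rate are all hypothetical stand-ins for the real generator and the encoded input image.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the frozen generator: maps a text embedding to an "image" vector.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.Tanh(),
    torch.nn.Linear(32, 64),
)
for p in model.parameters():
    p.requires_grad_(False)  # model weights are never updated

# Hypothetical stand-in for the encoded input image.
target = torch.rand(64)

# The text embedding is the only trainable tensor.
text_embedding = torch.nn.Parameter(torch.randn(16))
optimizer = torch.optim.Adam([text_embedding], lr=0.05)

losses = []
for step in range(200):
    optimizer.zero_grad()
    output = model(text_embedding)            # gradient flows back to the embedding
    loss = torch.nn.functional.mse_loss(output, target)
    loss.backward()
    optimizer.step()                          # updates text_embedding only
    losses.append(loss.item())

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because only `text_embedding` is handed to the optimizer and the weights have `requires_grad` disabled, no gradients or optimizer state are stored for the model parameters.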
Also, I wonder whether it's possible to generate the same picture every time. That could be a way to do text-based image modification. I tried removing temperature and filtering, but it didn't help. The seed is (presumably) always the same.
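For reference, a framework-independent sketch of what reproducible sampling requires: the random generator has to be reset to the same seed immediately before every generation call, not just once at startup. The `sample_tokens` helper and the probabilities are hypothetical; in PyTorch the analogous step would be calling `torch.manual_seed` right before each generation, and some GPU operations may still be nondeterministic even then.

```python
import random

def sample_tokens(probabilities, length, seed):
    """Draw `length` token ids from a categorical distribution using a freshly seeded RNG."""
    rng = random.Random(seed)  # new generator per call, so earlier calls cannot shift its state
    token_ids = list(range(len(probabilities)))
    return rng.choices(token_ids, weights=probabilities, k=length)

probs = [0.1, 0.2, 0.3, 0.4]
first = sample_tokens(probs, 8, seed=42)
second = sample_tokens(probs, 8, seed=42)

print(first == second)  # same seed, same state at call time -> identical samples
```

If the seed is set only once and generation is then run several times, each run starts from a different generator state, which would explain different pictures despite a fixed seed.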