ai-forever/ru-dalle

[Colab] Text embedding optimization

ifilipis opened this issue · 0 comments

Input image: https://static8.depositphotos.com/1370441/848/i/600/depositphotos_8486144-stock-photo-beach-and-tropical-sea.jpg

Input text: 'elon musk'

Result: two images (attached to the original issue)

Colab that runs out of memory: https://colab.research.google.com/drive/1ancv6fQMrzaz67Ikvfv3wnjlwpWsoebO?usp=sharing

My method is to optimize the transformer's text embedding so that the output moves closer to the input image. It is the same idea as fine-tuning, except that the text embeddings are optimized instead of the model weights. I had to modify the model's forward pass so that it retains the gradient. Sorry for the messy code.
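For readers who want the idea without opening the Colab, here is a minimal sketch of optimizing an input embedding while the model weights stay frozen. It is not the code from the notebook and does not use ruDALL-E: `frozen_model`, `target`, and `text_embedding` are hypothetical stand-ins (a tiny MLP, a random target vector, and a random starting embedding), and the MSE loss is an assumption chosen only to keep the example short.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the frozen transformer (a tiny MLP, NOT ruDALL-E).
frozen_model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 8))
for p in frozen_model.parameters():
    p.requires_grad_(False)  # model weights are never updated

# Hypothetical target, standing in for a representation of the input image.
target = torch.randn(1, 8)

# The only trainable tensor: the "text embedding".
text_embedding = torch.randn(1, 16, requires_grad=True)
optimizer = torch.optim.Adam([text_embedding], lr=0.05)


def optimize_embedding(steps=200):
    """Run gradient descent on the embedding only and return the loss history."""
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        output = frozen_model(text_embedding)
        loss = nn.functional.mse_loss(output, target)  # assumed loss
        loss.backward()  # gradients flow through the model into the embedding
        optimizer.step()
        losses.append(loss.item())
    return losses


losses = optimize_embedding()
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because the parameters have `requires_grad=False`, autograd does not accumulate gradients for them, and the optimizer only holds state for the embedding tensor.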

Also, is it possible to generate the same picture every time? That could be a way to do text-based image modification. I tried removing temperature and filtering, but it didn't help. The seed is (presumably) always the same.
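One thing that may be worth checking is where the seed is set. PyTorch's random generator advances with every draw, so two sampling calls only match if the generator is reseeded immediately before each one. The sketch below shows this on a toy distribution, and also shows greedy (argmax) decoding, which involves no randomness at all. `sample_tokens` and the `logits` values are hypothetical; whether the same fix applies to ruDALL-E's generation code is an assumption, not something confirmed here.

```python
import torch


def sample_tokens(logits, num_tokens, seed=None):
    """Sample token ids from logits; reseed first if a seed is given."""
    if seed is not None:
        torch.manual_seed(seed)  # reset the global RNG right before sampling
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=num_tokens, replacement=True)


# Hypothetical logits over a 5-token vocabulary.
logits = torch.tensor([1.0, 0.5, 0.2, 0.1, 2.0])

# Reseeding before each call gives identical samples.
a = sample_tokens(logits, 10, seed=42)
b = sample_tokens(logits, 10, seed=42)
print(torch.equal(a, b))

# Greedy decoding always picks the highest-scoring token.
greedy = torch.argmax(logits).item()
print(greedy)
```

If the outputs still differ after reseeding before every call, the remaining variation may come from nondeterministic GPU kernels rather than from sampling.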