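Use off-the-shelf language, image-generation, and image-scoring models to accept Chinese text as input and produce images as output.
- Chinese-to-English translation uses the Helsinki-NLP models from HuggingFace
- English-to-image generation uses VQGAN
- Image scoring uses OpenAI's CLIP
- VQGAN and CLIP are combined in the pixray project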
- https://medium.com/towards-data-science/how-i-built-an-ai-text-to-art-generator-a0c0f6d6f59f
- https://colab.research.google.com/github/justinjohn0306/VQGAN-CLIP/blob/main/VQGAN%2BCLIP_%28z%2Bquantize_method_with_augmentations%2C_user_friendly_interface%29.ipynb
- https://colab.research.google.com/github/dribnet/clipit/blob/master/demos/Swap_Model.ipynb
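
The pipeline described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it assumes the `transformers` package and the `Helsinki-NLP/opus-mt-zh-en` checkpoint for the translation step, and it leaves the VQGAN+CLIP step as a caller-supplied function. The names `make_translator`, `chinese_to_image`, and the `generate` callable are hypothetical and only illustrate how the stages connect.

```python
from typing import Callable

# Assumed checkpoint: the Helsinki-NLP Chinese-to-English model on HuggingFace.
MODEL_NAME = "Helsinki-NLP/opus-mt-zh-en"


def make_translator(model_name: str = MODEL_NAME) -> Callable[[str], str]:
    """Build a Chinese-to-English translation function (downloads the model on first use)."""
    # Imported lazily so the rest of the sketch works without transformers installed.
    from transformers import pipeline

    translate = pipeline("translation", model=model_name)

    def zh_to_en(text: str) -> str:
        # The translation pipeline returns a list of dicts with a "translation_text" key.
        return translate(text)[0]["translation_text"]

    return zh_to_en


def chinese_to_image(
    text_zh: str,
    translate: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Translate a Chinese prompt to English, then pass it to an image generator.

    `generate` is a hypothetical stand-in for the VQGAN+CLIP step (for example a
    wrapper around pixray); it takes an English prompt and returns an image path.
    """
    prompt_en = translate(text_zh).strip()
    if not prompt_en:
        raise ValueError("translation produced an empty prompt")
    return generate(prompt_en)
```

A call would look like `chinese_to_image("一只猫", make_translator(), my_generate)`, where `my_generate` is whatever wrapper around VQGAN+CLIP is in use. Keeping the generator as a parameter means the translation step can be tried on its own before a GPU-heavy image run.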