In this work, the authors show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics.
Official website: https://vectorfusion.github.io/
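For intuition, the sketch below illustrates the score-distillation idea behind the fine-tuning stage: the rasterized SVG is encoded to latents, noised, a frozen diffusion model predicts the noise under classifier-free guidance, and the difference between predicted and injected noise is pushed back through the differentiable rasterizer to the SVG parameters. This is a simplified illustration, not the repository's implementation; `unet`, `alphas_cumprod`, the timestep range, and the weighting are stand-in assumptions.

```python
import torch

def sds_grad(latents, cond_emb, uncond_emb, unet, alphas_cumprod, guidance_scale=100.0):
    """Illustrative (simplified) score-distillation gradient w.r.t. the latents.

    latents:        latents of the rasterized SVG (requires_grad upstream)
    cond_emb,
    uncond_emb:     text embeddings for the prompt and for the empty prompt
    unet:           a frozen noise predictor, eps = unet(x_t, t, emb)  (stand-in signature)
    alphas_cumprod: 1-D tensor of cumulative alphas from the diffusion schedule
    """
    t = torch.randint(50, 950, (1,), device=latents.device)        # random timestep
    noise = torch.randn_like(latents)
    a_bar = alphas_cumprod.to(latents.device)[t].view(-1, 1, 1, 1)
    noisy = a_bar.sqrt() * latents + (1.0 - a_bar).sqrt() * noise   # forward-diffuse the render

    with torch.no_grad():                                           # no gradient through the U-Net
        eps_cond = unet(noisy, t, cond_emb)
        eps_uncond = unet(noisy, t, uncond_emb)
    eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)     # classifier-free guidance

    return (1.0 - a_bar) * (eps - noise)                            # one common weighting choice

# Usage idea: latents.backward(gradient=sds_grad(...)), so the gradient flows back
# through the VAE encoder and the differentiable rasterizer to the SVG path parameters.
```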
- [01/2024] 🔥 We released SVGDreamer, a novel text-guided vector graphics synthesis method that considers both the editability of vector graphics and the quality of the synthesis.
- [12/2023] 🔥 We released PyTorch-SVGRender, the go-to library for state-of-the-art differentiable rendering methods for image vectorization.
- [10/2023] 🔥 We released the DiffSketcher code, a method for synthesizing vector sketches from text prompts.
- [10/2023] 🔥 We reproduced the VectorFusion code.
Create a new conda environment:

```shell
conda create --name vf python=3.10
conda activate vf
```

Install PyTorch and the following libraries:

```shell
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
pip install omegaconf BeautifulSoup4
pip install shapely
pip install opencv-python scikit-image matplotlib visdom wandb
pip install triton numba
pip install numpy scipy timm scikit-fmm einops
pip install accelerate transformers safetensors datasets
```

Install CLIP:

```shell
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
```

Install diffusers:

```shell
pip install diffusers==0.20.2
```

Install xformers (requires python=3.10):

```shell
conda install xformers -c xformers
```

Install diffvg:

```shell
git clone https://github.com/BachiLi/diffvg.git
cd diffvg
git submodule update --init --recursive
conda install -y -c anaconda cmake
conda install -y -c conda-forge ffmpeg
pip install svgwrite svgpathtools cssutils torch-tools
python setup.py install
```

Alternatively, run the project in Docker:

```shell
docker run --name vectorfusion --gpus all -it --ipc=host ximingxing/svgrender:v1 /bin/bash
```

Prompt: the Sydney Opera House.
Style: iconography
Preview:
| (a) Sample raster image with Stable Diffusion | (b) Convert raster image to a vector via LIVE | (c) VectorFusion: Fine tune by LSDS |
LIVE Rendering Process:
| iter 0 | iter 500 | iter 1000 | iter 1500 | iter 2500 | iter 3500 |
|---|---|---|---|---|---|
VectorFusion Rendering Process:
| iter 0 | iter 100 | iter 300 | iter 400 | iter 700 | iter 1000 |
|---|---|---|---|---|---|
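Both the LIVE initialization and the LSDS fine-tuning above depend on a differentiable rasterizer, so that image-space gradients can move SVG control points and colors. The snippet below is a minimal, self-contained pydiffvg example (not taken from this repo) that renders one closed cubic Bezier path and backpropagates a toy pixel loss to its parameters; the white target image is hypothetical.

```python
import torch
import pydiffvg

canvas_w, canvas_h = 224, 224

# One closed path made of 3 cubic Bezier segments (2 control points per segment).
num_control_points = torch.tensor([2, 2, 2])
points = (torch.rand(9, 2) * canvas_w).requires_grad_(True)      # learnable control points
fill = torch.tensor([0.3, 0.5, 0.8, 1.0], requires_grad=True)    # learnable RGBA fill

path = pydiffvg.Path(num_control_points=num_control_points, points=points,
                     stroke_width=torch.tensor(0.0), is_closed=True)
group = pydiffvg.ShapeGroup(shape_ids=torch.tensor([0]), fill_color=fill)

scene_args = pydiffvg.RenderFunction.serialize_scene(canvas_w, canvas_h, [path], [group])
render = pydiffvg.RenderFunction.apply
img = render(canvas_w, canvas_h, 2, 2, 0, None, *scene_args)     # H x W x 4, differentiable

loss = (img - torch.ones_like(img)).pow(2).mean()                # toy loss against a white canvas
loss.backward()                                                  # gradients reach `points` and `fill`
```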
Script:
```shell
python run_painterly_render.py \
  -c vectorfusion.yaml \
  -pt "the Sydney Opera House. minimal flat 2d vector icon. lineal color. on a white background. trending on artstation" \
  -save_step 50 \
  -update "K=6" \
  -respath ./workdir/SydneyOperaHouse \
  -d 15486 \
  --download
```

- `-c` a.k.a. `--config`: configuration file.
- `-save_step`: the step size used to save results (saving too frequently will increase runtime).
- `-update`: a tool for editing hyper-parameters of the configuration file, so you don't need to create a new YAML file.
- `-pt` a.k.a. `--prompt`: text prompt.
- `-respath` a.k.a. `--results_path`: the folder to save results.
- `-d` a.k.a. `--seed`: random seed.
- `--download`: download models from Hugging Face automatically the first time you run the script.
Optional:

- `-npt` a.k.a. `--negative_prompt`: negative text prompt.
- `-mv` a.k.a. `--make_video`: make a video of the rendering process (it will take much longer).
- `-frame_freq` a.k.a. `--video_frame_freq`: the interval, in steps, at which frames are saved.
- `-framerate` a.k.a. `--video_frame_rate`: controls the playback speed of the output video.
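The `-update` flag takes space-separated `key=value` pairs, and dotted keys (e.g. `sds.num_iter=1500`) reach into nested config sections. As a rough illustration only (not necessarily how `run_painterly_render.py` parses it), such overrides can be merged into the YAML config with OmegaConf, which is already listed as a dependency; the config path below is hypothetical.

```python
# Illustrative sketch only; the actual argument handling in this repo may differ.
from omegaconf import OmegaConf

cfg = OmegaConf.load("vectorfusion.yaml")          # hypothetical path to the base config
overrides = "style=sketch num_segments=5 sds.num_iter=1500"

# from_dotlist understands dotted keys, so nested fields like sds.num_iter resolve correctly.
cfg = OmegaConf.merge(cfg, OmegaConf.from_dotlist(overrides.split()))
print(OmegaConf.to_yaml(cfg))
```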
Prompt: A photo of a Ming Dynasty vase on a leather topped table.
Style: iconography
Preview:
| (a) Sample raster image with Stable Diffusion | (b) Convert raster image to a vector via LIVE | (c) VectorFusion: Fine tune by LSDS |
Script:
python run_painterly_render.py -c vectorfusion.yaml -pt "A photo of a Ming Dynasty vase on a leather topped table. minimal flat 2d vector icon. lineal color. on a white background. trending on artstation" -save_step 50 -respath ./workdir/vase -d 683692Prompt: An astronaut figure.
Style: iconography
Preview:
| (a) Sample raster image with Stable Diffusion | (b) Convert raster image to a vector via LIVE | (c) VectorFusion: Fine tune by LSDS |
Script:
python run_painterly_render.py -c vectorfusion.yaml -pt "An astronaut figure. minimal flat 2d vector icon. lineal color. on a white background. trending on artstation" -save_step 50 -respath ./workdir/astronaut -d 522178Prompt: Electric guitar.
Style: Pixel-Art
Preview:
| (a) Sample raster image with Stable Diffusion | (b) Convert raster image to a vector via LIVE | (c) VectorFusion: Fine tune by LSDS |
Script:
python run_painterly_render.py -c vectorfusion.yaml -pt "Electric guitar. pixel art. trending on artstation" -save_step 50 -respath ./workdir/guitar -update "style=pixelart" -d 445997 Prompt: watercolor painting of a firebreathing dragon.
Style: Sketch
Preview:
| SVG initialization | VectorFusion fine-tune 500 step | VectorFusion fine-tune 1500 step |
Script:
python run_painterly_render.py -c vectorfusion.yaml -pt "watercolor painting of a firebreathing dragon. minimal 2d line drawing. trending on artstation" -save_step 50 -respath ./workdir/dragon-sketch -update "style=sketch num_segments=5 radius=0.5 sds.num_iter=1500" -d 106764 # Sketch style
CUDA_VISIBLE_DEVICES=0 python run_painterly_render.py -c vectorfusion.yaml -pt "watercolor painting of a firebreathing dragon. minimal 2d line drawing. trending on artstation" -save_step 50 -respath ./workdir/dragon-sketch -update "style=sketch skip_live=True num_paths=32 num_segments=5 radius=0.5 sds.num_iter=1500" -rdbz
CUDA_VISIBLE_DEVICES=0 python run_painterly_render.py -c vectorfusion.yaml -pt "A cat. minimal 2d line drawing. trending on artstation" -save_step 50 -respath ./workdir/cat-sketch -update "style=sketch skip_live=True num_paths=32 num_segments=5 radius=0.5 sds.num_iter=1500" -rdbzMore Examples:
- Check Examples.md for more cases.
More Scripts:
- Check Run.md for more scripts.
This project is built upon the following repositories:
We gratefully thank the authors for their wonderful work.
If you use this code for your research, please cite the following work:
@inproceedings{jain2023vectorfusion,
  title={VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models},
author={Jain, Ajay and Xie, Amber and Abbeel, Pieter},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={1911--1920},
year={2023}
}
@inproceedings{xing2023diffsketcher,
title={DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models},
author={XiMing Xing and Chuang Wang and Haitao Zhou and Jing Zhang and Qian Yu and Dong Xu},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=CY1xatvEQj}
}
This repo is licensed under the MIT License.
