CollageRL

ICCV 2023 - Neural Collage Transfer: Artistic Reconstruction via Material Manipulation

Neural Collage Transfer: Artistic Reconstruction via Material Manipulation (ICCV 2023)

Ganghun Lee, Minji Kim, Yunsu Lee, Minsu Lee, and Byoung-Tak Zhang

Description

An official implementation of the paper Neural Collage Transfer: Artistic Reconstruction via Material Manipulation.

Examples

Requirements

  • Python 3.8.5 (Conda)
  • PyTorch 1.11.0

We recommend creating a new Python 3.8.5 environment and then installing the dependencies:
$ pip install -r requirements.txt

Inference

Use infer.sh to test the model on your own image.

Goal image

The goal image should be placed in samples/goal/.
ex) samples/goal/boat.jpg

Material images

The materials are a set of images, so please make your own folder (e.g., newspaper/) containing all your material images.
Then move the folder to the directory samples/materials/.
ex) samples/materials/newspaper/

To get started quickly, you can download a prepared set of newspaper images from the following dataset:
(Vilkin, Aleksey and Safonov, Ilia. (2014). Newspaper and magazine images segmentation dataset. UCI Machine Learning Repository. https://doi.org/10.24432/C5N60V.)
The download contains several kinds of files, but only the .jpg files are needed (please delete the others).
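The cleanup step above can be sketched with a small helper. This is only an illustration, not part of the repository: the function name keep_only_jpgs and its dry_run flag are hypothetical, and it assumes that "the .jpgs" means files whose extension is .jpg in any letter case.

```python
from pathlib import Path


def keep_only_jpgs(folder, dry_run=True):
    """List the non-.jpg files directly inside `folder`.

    With dry_run=True (the default) nothing is deleted; the function only
    reports what it would remove. With dry_run=False the files are deleted.
    Assumption: only the ".jpg" extension (case-insensitive) counts as a
    material image.
    """
    extras = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() != ".jpg"
    )
    if not dry_run:
        for p in extras:
            p.unlink()
    return extras
```

For example, keep_only_jpgs("samples/materials/newspaper") would return the files that are not material images, and passing dry_run=False would delete them. Reviewing the dry-run output first is safer, because deletion cannot be undone.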

Instruction

Please make sure to set your goal/material path in infer.sh.
GOAL_PATH='samples/goals/your_own_goal.jpg' (the extension does not have to be .jpg)
SOURCE_DIR='samples/materials/your_own_material_folder'

Now you can run the code.
$ bash infer.sh
This will take some time, and the results will be saved to samples/results/.

Configuration

  • GOAL_RESOLUTION - resolution of the result image
  • GOAL_RESOLUTION_FIT - how the resolution is fitted (horizontal | vertical | square)
  • SOURCE_RESOLUTION_RATIO - material image casting size (0-1)
  • SOURCE_LOAD_LIMIT - maximum number of material images to load (prevents RAM overload)
  • SOURCE_SAMPLE_SIZE - number of material images the agent sees at each step
  • MIN_SOURCE_COMPLEXITY - minimum allowed complexity for materials (>=0); prevents using overly simple ones
  • SCALE_ORDER - scale sequence for multi-scale collage
  • NUM_CYCLES - number of steps for each sliding window
  • WINDOW_RATIO - stride ratio of the sliding window (0-1); e.g., 0.5 gives stride = window_size x 0.5
  • MIN_SCRAP_SIZE - minimum allowed scrap size (0-1); prevents overly small scraps
  • SENSITIVITY - complexity-sensitivity value for multi-scale collage
  • FIXED_T - fixed value of t_channel for multi-scale collage
  • FPS - frames per second of the result video
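To make the WINDOW_RATIO entry concrete, here is a minimal sketch of how a stride of window_size x WINDOW_RATIO translates into window positions along one image axis. The function window_starts is hypothetical and is not taken from the repository; in particular, adding a last window flush with the image edge is an assumption, and the actual code may handle the border differently.

```python
def window_starts(length, window_size, window_ratio):
    """Start offsets of a 1-D sliding window.

    The stride is window_size * window_ratio, as described for WINDOW_RATIO.
    Assumption: a final window flush with the edge is added when the stride
    does not land on it exactly, so that no pixels are left uncovered.
    """
    if not 0 < window_ratio <= 1:
        raise ValueError("window_ratio must be in (0, 1]")
    if window_size > length:
        raise ValueError("window_size must not exceed length")
    stride = max(1, int(window_size * window_ratio))
    last = length - window_size
    starts = list(range(0, last + 1, stride))
    if starts[-1] != last:
        starts.append(last)
    return starts
```

With a 512-pixel axis, a 128-pixel window, and WINDOW_RATIO = 0.5, the stride is 64 and the windows start at 0, 64, ..., 384. A smaller ratio gives more overlapping windows and therefore a longer run.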

You can also toggle the following options:

  • skip_negative_reward - Whether to undo actions that led to a negative MSE reward
  • paper_like - Whether to use the torn paper effect
  • disallow_duplicate - Whether to disallow duplicate usage of materials

We recommend adjusting SENSITIVITY first, within a range of about 1-5.

Training

Dataset

Goals and materials should be prepared for training.

Goal set

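This code supports the following datasets for goals (names as used for --goal and in the example directory tree):

  • imagenet (imagenet/)
  • mnist (MNIST/)
  • flower (flowers-102/)
  • scene (IntelScene/)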

Material set

For materials, training properly supports only the Describable Textures Dataset (DTD).

Please place all of your datasets in the same data directory.
As an example, here is our directory tree for ~/Datasets.

Datasets/
├── dtd
│   ├── images
│   ├── imdb
│   └── labels
├── flowers-102
│   ├── imagelabels.mat
│   ├── jpg
│   └── setid.mat
├── imagenet
│   ├── meta.bin
│   ├── train
│   └── val
├── IntelScene
│   ├── train
│   └── val
└── MNIST
    └── raw

Then set --data_path in train.sh to your data directory.
ex) --data_path ~/Datasets
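Before launching a long training run, it can help to check that the data directory matches the example tree. The sketch below is only an illustration: the EXPECTED table is copied from the tree above, while the helper missing_entries is hypothetical and not part of the repository. It does not verify file contents, only that the listed entries exist.

```python
from pathlib import Path

# Top-level entries per dataset folder, copied from the example tree.
EXPECTED = {
    "dtd": ["images", "imdb", "labels"],
    "flowers-102": ["imagelabels.mat", "jpg", "setid.mat"],
    "imagenet": ["meta.bin", "train", "val"],
    "IntelScene": ["train", "val"],
    "MNIST": ["raw"],
}


def missing_entries(data_path, datasets):
    """Return the expected entries that are absent under `data_path`.

    `datasets` is a list of folder names from EXPECTED, e.g. ["dtd", "imagenet"].
    Paths in the result are relative to `data_path`.
    """
    root = Path(data_path).expanduser()
    missing = []
    for name in datasets:
        for entry in EXPECTED[name]:
            path = root / name / entry
            if not path.exists():
                missing.append(str(path.relative_to(root)))
    return missing
```

For instance, missing_entries("~/Datasets", ["dtd", "imagenet"]) returns an empty list when both datasets are laid out as in the tree.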

Wandb

Before training, please set up and log in to your wandb account for logging.

Instruction

Set --goal in train.sh to the appropriate name (imagenet | mnist | flower | scene).
Tip: imagenet is for general use.

--source selects the material set; in general, only dtd is supported.
However, you can use other materials for these specific goal-material pairs: (imagenet-imagenet, mnist-mnist, flower-flower, scene-scene).

Now just run the code to train:
$ bash train.sh

The progress and results will be saved to outputs/.
If your RAM gets overloaded, you can decrease the replay memory size with --replay_size.

Renderer (Shaper)

To make the rendering process differentiable, we implemented and pretrained a shaper network, as shown in shaper/shaper_training.ipynb.
We also used the Kornia library for differentiable image translation.

Citation

If you find this work useful, please cite the paper as follows:

@inproceedings{lee2023neural,
  title={Neural Collage Transfer: Artistic Reconstruction via Material Manipulation},
  author={Lee, Ganghun and Kim, Minji and Lee, Yunsu and Lee, Minsu and Zhang, Byoung-Tak},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={2394--2405},
  year={2023}
}

Acknowledgements

Many thanks to the authors of Learning to Paint for inspiring this work. They also inspired our other work From Scratch to Sketch.
We also appreciate the contributors of Kornia for providing useful differentiable image processing operators.