Download the CGL-dataset from the official website. To obtain clean images, use LaMa to erase the text from the training images, then put the clean images in DatasetRoot/images/train.
Download the testing data and text features from the url and unzip them to DatasetRoot. The final DatasetRoot structure should be as follows.
DatasetRoot
├── annotations
│   ├── train.json
│   └── test.json
├── images
│   ├── train
│   └── test
├── text_content
│   ├── train.txt
│   └── test.txt
└── text_features
    ├── train
    └── test
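Before training, it can save time to confirm the directory layout matches the tree above. The following is a small sketch (the helper name `check_dataset_root` is ours, not part of this repo) that reports any missing entries:

```python
from pathlib import Path

# Entries taken from the DatasetRoot tree in this README.
EXPECTED = [
    "annotations/train.json",
    "annotations/test.json",
    "images/train",
    "images/test",
    "text_content/train.txt",
    "text_content/test.txt",
    "text_features/train",
    "text_features/test",
]

def check_dataset_root(root):
    """Return the expected entries missing under `root`.

    An empty list means the DatasetRoot structure is complete.
    """
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Run it once after unzipping; any path it returns still needs to be downloaded or generated.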
The source code is based on DiffusionDet and Detectron2.
(1) Install Detectron2 (pytorch=1.8.0, python=3.7, cuda=11.1).
(2) Install the remaining dependencies:
pip install -r requirements.txt
Before training, you need to modify DATASET_PATH, OUTPUT_DIR, and TEXT_FEATURE_PATH in configs/radm.yaml.
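If you prefer not to edit the config by hand, a minimal sketch like the one below can rewrite those fields. It assumes the three keys appear as flat top-level `KEY: value` lines in configs/radm.yaml; if the actual config nests them under a section, adjust accordingly (`set_config_paths` is our illustrative helper, not part of this repo):

```python
from pathlib import Path

def set_config_paths(cfg_path, overrides):
    """Rewrite top-level `KEY: value` lines in a flat YAML config.

    `overrides` maps key names (e.g. DATASET_PATH, OUTPUT_DIR,
    TEXT_FEATURE_PATH, as described in this README) to new values.
    Lines whose key is not in `overrides` are left untouched.
    """
    lines = Path(cfg_path).read_text().splitlines()
    out = []
    for line in lines:
        key = line.split(":", 1)[0].strip()
        if key in overrides:
            out.append(f"{key}: {overrides[key]}")
        else:
            out.append(line)
    Path(cfg_path).write_text("\n".join(out) + "\n")
```

For anything beyond flat keys, a YAML parser such as PyYAML is the safer choice; this line-based version just avoids the extra dependency.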
python3 train_net.py --num-gpus 1 \
    --config-file configs/radm.yaml
To resume training from a checkpoint with multiple GPUs:
python3 train_net.py --num-gpus 4 \
--config-file configs/radm.yaml \
--resume
Run inference and evaluation:
python train_net.py --num-gpus 1 \
--config-file configs/radm.yaml \
--eval-only --resume
To evaluate the layout results, you first need to modify test_imgdir, test_annotation, and test_label in metrics.py. The functions that compute each metric are defined in that file.
python metrics.py
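As an illustration of the kind of metric such a script computes, the sketch below measures average pairwise overlap between layout elements, a common layout-quality score; the exact formulas in this repo's metrics.py may differ, and `pairwise_overlap` is our name, not one from the repo:

```python
def pairwise_overlap(boxes):
    """Average pairwise overlap between layout elements.

    Each box is (x1, y1, x2, y2). For every ordered pair (i, j), i != j,
    we take the intersection area divided by the area of box i, then
    average over all pairs. 0.0 means no element overlaps another;
    identical boxes score 1.0. Illustrative only.
    """
    def inter(a, b):
        w = min(a[2], b[2]) - max(a[0], b[0])
        h = min(a[3], b[3]) - max(a[1], b[1])
        return max(w, 0) * max(h, 0)

    def area(a):
        return max(a[2] - a[0], 0) * max(a[3] - a[1], 0)

    pairs, total = 0, 0.0
    for i in range(len(boxes)):
        for j in range(len(boxes)):
            if i != j and area(boxes[i]) > 0:
                total += inter(boxes[i], boxes[j]) / area(boxes[i])
                pairs += 1
    return total / pairs if pairs else 0.0
```

Lower is better for a poster layout, since overlapping text or underlay elements usually indicates a poor arrangement.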
@inproceedings{fengheng2023relation,
  author    = {Li, Fengheng and Liu, An and Feng, Wei and Zhu, Honghe and Li, Yaoyu and Zhang, Zheng and Lv, Jingjing and Zhu, Xin and Shen, Junjie and Lin, Zhangang and Shao, Jingping},
  title     = {Relation-Aware Diffusion Model for Controllable Poster Layout Generation},
  year      = {2023},
  booktitle = {Proceedings of the 32nd ACM International Conference on Information and Knowledge Management},
  pages     = {1249--1258},
}