Yutao Cheng*, Zhao Zhang*, Maoke Yang*,
Hui Nie, Chunyuan Li, Xinglong Wu, and Jie Shao
[arXiv 📚]
[Layout Results 🖼️]
[Bibtex 🔗]
Graphist is a design model built on a Large Multimodal Model (LMM) for Hierarchical Layout Generation (HLG). Unlike traditional Graphic Layout Generation (GLG) tasks, which require a predefined sequence of layers, HLG generates graphic compositions from unordered sets of elements. The following figure illustrates the distinction between the two tasks. In HLG, accurate layer ordering and spatial arrangement are both crucial to the effectiveness of the final graphic composition.
The following posters were created by volunteers using our Graphist web demo. Users upload design elements, and Graphist automatically generates a variety of graphic compositions.
Graphist reformulates HLG as a sequence generation problem. It accepts RGB-A images as input and produces a JSON draft protocol that specifies the coordinates, dimensions, and layer order of each design element. For an in-depth explanation, please consult our manuscript.
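To make the idea of a JSON draft more concrete, here is a minimal sketch of how a downstream tool might read such a draft and recover the stacking order. The field names (`canvas`, `layers`, `index`, `x`, `y`, `width`, `height`), the file names, and the sample values are assumptions made for illustration only; the actual protocol is the one described in the manuscript.

```python
import json

# Hypothetical draft: every field name and value here is an illustrative
# assumption, not the schema that Graphist actually emits.
draft_text = """
{
  "canvas": {"width": 800, "height": 600},
  "layers": [
    {"name": "title.png", "index": 2, "x": 100, "y": 40, "width": 600, "height": 120},
    {"name": "background.png", "index": 0, "x": 0, "y": 0, "width": 800, "height": 600},
    {"name": "logo.png", "index": 1, "x": 650, "y": 480, "width": 100, "height": 100}
  ]
}
"""


def parse_draft(text):
    """Return the canvas size and the layers sorted bottom-to-top."""
    draft = json.loads(text)
    layers = sorted(draft["layers"], key=lambda layer: layer["index"])
    return draft["canvas"], layers


def inside_canvas(layer, canvas):
    """Check that a layer's bounding box lies fully within the canvas."""
    return (
        layer["x"] >= 0
        and layer["y"] >= 0
        and layer["x"] + layer["width"] <= canvas["width"]
        and layer["y"] + layer["height"] <= canvas["height"]
    )


canvas, layers = parse_draft(draft_text)
order = [layer["name"] for layer in layers]
all_inside = all(inside_canvas(layer, canvas) for layer in layers)
```

Sorting by the assumed `index` field gives the order in which a renderer would paint the elements, from the bottom layer to the top one, and the bounds check is a simple sanity test that a consumer of the draft might run before rendering.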
[2024/04/23] Our manuscript is now available on arXiv.
- Release the Graphist checkpoint trained with the Crello dataset
- Publish layout results on the Crello dataset
If you find this work useful, please cite it. We hope more researchers will take an interest in the HLG task.
@article{graphist2023hlg,
title={Graphic Design with Large Multimodal Model},
author={Cheng, Yutao and Zhang, Zhao and Yang, Maoke and Nie, Hui and Li, Chunyuan and Wu, Xinglong and Shao, Jie},
journal={arXiv preprint arXiv:2404.14368},
year={2024}
}