Official implementation of DiagrammerGPT, a novel two-stage text-to-diagram generation framework that leverages the layout guidance capabilities of LLMs to generate more accurate open-domain, open-platform diagrams.
Abhay Zala, Han Lin, Jaemin Cho, Mohit Bansal
- Diagram Plan Generation Source Code
- Diagram Generation Source Code
- AI2D-Caption Dataset Release
An overview of DiagrammerGPT, our two-stage framework for open-domain, open-platform diagram generation.
- In the first diagram planning stage, given a prompt, our LLM (GPT-4) generates a diagram plan, which consists of dense entities (objects and text labels), fine-grained relationships (between the entities), and precise layouts (2D bounding boxes of entities). Then, the LLM iteratively refines the diagram plan (i.e., updates the plan to better align with the input prompt).
- In the second diagram generation stage, our DiagramGLIGEN outputs the diagram given the diagram plan; we then render the text labels on the diagram.
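
To make the notion of a diagram plan more concrete, here is a minimal sketch of how such a plan (entities, relationships, and bounding-box layouts) might be represented and sanity-checked before it is handed to the generation stage. It is an illustration only: the class names, field names, relationship kinds, and the normalized `(x0, y0, x1, y1)` box convention are assumptions, not the schema used in this repository.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# All names below are hypothetical; they do not mirror the repository's code.
# Assumed box convention: (x0, y0, x1, y1), normalized to [0, 1].
BBox = Tuple[float, float, float, float]


@dataclass
class Entity:
    name: str
    kind: str  # assumed values: "object" or "text_label"
    bbox: BBox


@dataclass
class Relationship:
    source: str
    target: str
    kind: str  # illustrative values: "arrow", "line", "labels"


@dataclass
class DiagramPlan:
    entities: List[Entity] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)


def validate_plan(plan: DiagramPlan) -> List[str]:
    """Return a list of problems found in the plan; empty if none."""
    problems = []
    names = {e.name for e in plan.entities}
    # Every box must lie inside the canvas and have positive area.
    for e in plan.entities:
        x0, y0, x1, y1 = e.bbox
        if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
            problems.append(f"bad bbox for {e.name}: {e.bbox}")
    # Every relationship must connect entities that exist in the plan.
    for r in plan.relationships:
        for end in (r.source, r.target):
            if end not in names:
                problems.append(f"unknown entity in relationship: {end}")
    return problems


plan = DiagramPlan(
    entities=[
        Entity("sun", "object", (0.05, 0.05, 0.30, 0.30)),
        Entity("earth", "object", (0.60, 0.55, 0.80, 0.75)),
        Entity("sun_label", "text_label", (0.05, 0.32, 0.30, 0.40)),
    ],
    relationships=[
        Relationship("sun", "earth", "arrow"),
        Relationship("sun_label", "sun", "labels"),
    ],
)
print(validate_plan(plan))
```

A check like `validate_plan` could also serve as one cheap signal during refinement, for example by feeding the reported problems back to the LLM, although the refinement described above is performed by the LLM itself.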
If you find our project useful in your research, please cite the following paper:
@article{Zala2023DiagrammerGPT,
author = {Abhay Zala and Han Lin and Jaemin Cho and Mohit Bansal},
title = {DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning},
year = {2023},
}