MMSceneGraph is an open-source code hub for scene graph generation and for downstream tasks built on scene graphs, implemented in PyTorch. The front-end object detector is provided by open-mmlab/mmdetection.
- Modular design
We decompose the framework into different components and one can easily construct a customized scene graph generation framework by combining different modules.
- Support of multiple frameworks out of the box
The toolbox directly supports popular and contemporary detection frameworks, e.g. Faster RCNN, Mask RCNN, etc.
- Visualization support
The toolbox includes built-in visualization of both ground-truth and predicted scene graphs.
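To illustrate the modular design described above, here is a minimal sketch of how config-driven composition can work. It is not the MMSceneGraph API: `Registry`, `DETECTORS`, `RELATION_HEADS`, `ToyDetector`, `ToyRelationHead`, and `SceneGraphGenerator` are hypothetical names made up for this example, and the "detector" simply returns fixed labels instead of running a real model.

```python
from typing import Any, Dict, List, Tuple


class Registry:
    """Hypothetical registry mapping a class name to its class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, type] = {}

    def register(self, cls: type) -> type:
        # Used as a decorator: stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        # Copy so the caller's config dict is not mutated.
        args = dict(cfg)
        module_type = args.pop("type")
        if module_type not in self._modules:
            raise KeyError(f"{module_type!r} is not registered in {self.name}")
        return self._modules[module_type](**args)


# Hypothetical registries, one per kind of component.
DETECTORS = Registry("detector")
RELATION_HEADS = Registry("relation_head")


@DETECTORS.register
class ToyDetector:
    """Stand-in for an object detector: returns fixed labels."""

    def __init__(self, labels: List[str]) -> None:
        self.labels = labels

    def detect(self, image: Any) -> List[str]:
        return list(self.labels)


@RELATION_HEADS.register
class ToyRelationHead:
    """Stand-in for a relation head: links consecutive objects."""

    def __init__(self, predicate: str = "near") -> None:
        self.predicate = predicate

    def predict(self, objects: List[str]) -> List[Tuple[str, str, str]]:
        return [(s, self.predicate, o) for s, o in zip(objects, objects[1:])]


class SceneGraphGenerator:
    """Combines a detector and a relation head built from config dicts."""

    def __init__(self, detector: Dict[str, Any], relation_head: Dict[str, Any]) -> None:
        self.detector = DETECTORS.build(detector)
        self.relation_head = RELATION_HEADS.build(relation_head)

    def __call__(self, image: Any) -> List[Tuple[str, str, str]]:
        return self.relation_head.predict(self.detector.detect(image))


config = dict(
    detector=dict(type="ToyDetector", labels=["person", "horse"]),
    relation_head=dict(type="ToyRelationHead", predicate="riding"),
)
model = SceneGraphGenerator(**config)
triples = model(image=None)
print(triples)
```

In this sketch, swapping a component only requires changing the `type` entry (and its arguments) in the config; the code that assembles the pipeline stays the same.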
This project is released under the MIT license.
Please refer to CHANGELOG.md for the release history and details of changes.
The original object detection results and models provided by mmdetection are available in the model zoo. The models for scene graph generation are not yet available.
Supported SGG (VRD) methods:
- Neural Motifs (CVPR'2018)
- VCTree (CVPR'2019)
- TDE (CVPR'2020)
- VTransE (CVPR'2017)
- IMP (CVPR'2017)
- KERN (CVPR'2019)
- GPSNet (CVPR'2020)
- HetH (ECCV'2020, ours)
- TopicSG (ICCV'2021, ours)
Supported salient object detection methods:
- R3Net (IJCAI'2018)
- SCRN (ICCV'2019)
Supported image captioning methods:
- bottom-up (CVPR'2018)
- XLAN (CVPR'2020)
Supported datasets:
- Visual Genome: VG150 (CVPR'2017)
- VRD (ECCV'2016)
- Visual Genome: VG200/VG-KR (ours)
- MSCOCO (for object detection and image captioning)
- RelCap (from VG and COCO, ours)
Our project is built on mmdetection 1.x, which differs somewhat from the current mmdetection master version (2.x), so please follow INSTALL.md for installation. If you want to use mmdetection 2.x, please refer to mmdetection/get_start.md.
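Because the 1.x and 2.x lines of mmdetection differ, it can help to confirm which one is installed in your environment. The snippet below is a small sketch rather than part of this project: `major_version` and `mmdet_major_version` are hypothetical helper names, and it assumes the installed `mmdet` package exposes a `__version__` string.

```python
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.2.0'."""
    return int(version.split(".")[0])


def mmdet_major_version() -> Optional[int]:
    """Return the major version of the installed mmdet, or None if absent."""
    try:
        import mmdet  # assumption: mmdet provides __version__
    except ImportError:
        return None
    return major_version(mmdet.__version__)


major = mmdet_major_version()
if major is None:
    print("mmdet is not installed; see INSTALL.md")
elif major == 1:
    print("mmdet 1.x detected")
else:
    print(f"mmdet {major}.x detected; this project is built on 1.x")
```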
Please refer to GETTING_STARTED.md for how to use the project. We will keep it up to date.
We thank the contributors of the mmdetection project and of Scene-Graph-Benchmark.pytorch, which inspired our design.
If you find this code hub or our work useful in your research, please consider citing:
@inproceedings{wang2021topic,
title={Topic Scene Graph Generation by Attention Distillation from Caption},
author={Wang, Wenbin and Wang, Ruiping and Chen, Xilin},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
pages={15900--15910},
month = {October},
year={2021}
}
@inproceedings{wang2020sketching,
title={Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation},
author={Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
pages={222--239},
year={2020},
volume={12358},
doi={10.1007/978-3-030-58601-0_14},
publisher={Springer}
}
@InProceedings{Wang_2019_CVPR,
author = {Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
title = {Exploring Context and Visual Pattern of Relationship for Scene Graph Generation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {8188--8197},
month = {June},
address = {Long Beach, California, USA},
doi = {10.1109/CVPR.2019.00838},
year = {2019}
}