Scene Graph Modification Based on Natural Language Commands
Descriptions
This repo contains the source code and pre-processed corpora for "Scene Graph Modification Based on Natural Language Commands" (paper), accepted to Findings of EMNLP 2020.
Demo
We demonstrate four different operations one can execute on scene graphs:
Dependencies
- python3
- pytorch==1.1
- networkx
- spacy>=2.3.1
Usage
git clone https://github.com/xlhex/SceneGraphModification.git
Data
General Information
We create three datasets for the scene graph modification task: 1) MSCOCO data, 2) GCC data and 3) crowdsourced data. The first two are constructed heuristically, while the last one is crowdsourced from Amazon Mechanical Turk (please refer to our paper for details).
Each dataset is partitioned into train/dev/test, with each split consisting of the following files:
- source scene graph: {split}_src_graph.bin
- modification query: {split}_src_text.txt
- target scene graph: {split}_tgt_graph.bin
The datasets can be downloaded from here
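Before training, it can help to confirm that a downloaded dataset directory follows the naming scheme above. The snippet below is a minimal sketch, not part of this repo: the helper names expected_files and missing_files are hypothetical, and it only checks file names, since the internal format of the .bin files is not described here.

```python
from pathlib import Path

# Splits and file suffixes as listed in the README.
SPLITS = ("train", "dev", "test")
SUFFIXES = ("src_graph.bin", "src_text.txt", "tgt_graph.bin")


def expected_files(data_dir):
    """Return the paths implied by the {split}_{suffix} naming scheme."""
    root = Path(data_dir)
    return [root / f"{split}_{suffix}" for split in SPLITS for suffix in SUFFIXES]


def missing_files(data_dir):
    """Return the expected files that are not present in data_dir."""
    return [path for path in expected_files(data_dir) if not path.is_file()]
```

Calling missing_files("PATH_TO_YOUR_DATA") returns an empty list when all three files are present for each of the three splits.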
Train a model
The following code shows how to train an early fusion (cross-attention) model on a given dataset:
cd code
DATA=PATH_TO_YOUR_DATA
# set CKPT_DIR to the directory where checkpoints and the log should be written
CKPT_DIR=
EPOCH=20
FUSION=early
log="${CKPT_DIR}/log.txt"
if [ ! -d "$CKPT_DIR" ]; then
    mkdir -p "$CKPT_DIR"
fi
# build a dictionary for training and inference
python preprocess.py "$DATA"
python train.py --data-dir "$DATA" --epochs "$EPOCH" --seed 1 --ckpt-dir "$CKPT_DIR" --modification "$FUSION" --batch-size 256 --accumulation-steps 1 > "$log"
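When launching several runs (for example, one per dataset), the same train.py invocation can be assembled from Python. This is a sketch under assumptions: build_train_command is a hypothetical helper that is not part of this repo, it uses only the flags shown in the command above, and it rejects an empty checkpoint directory so the log is not written to an unintended location.

```python
import shlex


def build_train_command(data_dir, ckpt_dir, epochs=20, fusion="early",
                        batch_size=256, accumulation_steps=1, seed=1):
    """Assemble the train.py argument list using the flags shown above."""
    if not ckpt_dir:
        # An empty checkpoint directory would make the log path "/log.txt".
        raise ValueError("ckpt_dir must be a non-empty path")
    return [
        "python", "train.py",
        "--data-dir", str(data_dir),
        "--epochs", str(epochs),
        "--seed", str(seed),
        "--ckpt-dir", str(ckpt_dir),
        "--modification", fusion,
        "--batch-size", str(batch_size),
        "--accumulation-steps", str(accumulation_steps),
    ]


def as_shell(command):
    """Render an argument list as a single shell-quoted string."""
    return " ".join(shlex.quote(arg) for arg in command)
```

The resulting list can be passed to subprocess.run from inside the code directory, with standard output redirected to a log file as in the shell version.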
Inference
The following code shows how to generate a target graph, given a source graph and a modification query:
cd code
DATA=PATH_TO_YOUR_DATA
# set CKPT_DIR to the checkpoint directory used during training
CKPT_DIR=
FUSION=early
python search.py --data-dir "$DATA" --greedy-search --batch-size 1 --ckpt-dir "$CKPT_DIR" --modification "$FUSION"
Instance Visualisation
You can visualise individual modification instances. For example, the following code visualises the first two instances (indices 0 and 1). The rendered source and target graphs can be found in scripts/display.
cd scripts
SRC_GRAPH=PATH_TO_SRC_GRAPH
TGT_GRAPH=PATH_TO_TGT_GRAPH
QUERY=PATH_TO_QUERY
python visualisation.py --src-graph "$SRC_GRAPH" --tgt-graph "$TGT_GRAPH" --graph-idx 0,1 --query "$QUERY"
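The --graph-idx value above is a comma-separated list of zero-based instance indices, so 0,1 selects the first two instances. How visualisation.py parses this value is not shown here; the following is only a sketch of one way such a string could be interpreted, and parse_graph_indices is a hypothetical name.

```python
def parse_graph_indices(value):
    """Turn a string such as "0,1" into a list of zero-based indices."""
    indices = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            # Tolerate stray commas such as "0,1,".
            continue
        index = int(part)
        if index < 0:
            raise ValueError(f"index must be non-negative, got {index}")
        indices.append(index)
    return indices
```

Under this reading, passing 0,1,2 would select the first three instances.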
Citation
Please cite as:
@misc{he2020scene,
title={Scene Graph Modification Based on Natural Language Commands},
author={Xuanli He and Quan Hung Tran and Gholamreza Haffari and Walter Chang and Trung Bui and Zhe Lin and Franck Dernoncourt and Nhan Dam},
year={2020},
eprint={2010.02591},
archivePrefix={arXiv},
primaryClass={cs.CL}
}