🍸️RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments

This is the official repository for the RIO benchmark, including dataset downloads, baseline method implementations, and validation. If you find our dataset helpful in your work, please ⭐star and 📖cite our work.

🏡Project page 🔗arXiv

📝News

Our work has been accepted to the NeurIPS 2023 Datasets and Benchmarks track. We will progressively evaluate the performance of MLLMs on RIO.

📌Data Example

You can follow the steps in example.ipynb to read and visualize sample ground-truth annotations.

Here is a ground truth example from the dataset:

The intention description is "you can use the thing to cut the food on the table".

More examples are available in ./examples.

📑Data Format

Each annotation contains the following fields:

"height": height of the image.
"width": width of the image.
"image_id": COCO image id.
"task_id": per-intention id.
"expressions": intention description, e.g., "You can use the thing to cook meal."
"bbox_list": bounding-box annotations of all instances of the class that can satisfy the intention.
"mask_list": mask annotations of all instances of the class that can satisfy the intention.

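As a rough illustration of the fields listed above, the following minimal Python sketch validates one annotation record and summarizes it. The record is made up for illustration, the helper name summarize_annotation is hypothetical, and the assumption that each annotation is a plain dict holding these seven keys is not confirmed by the repository; example.ipynb remains the reference for the actual loading code.

```python
REQUIRED_KEYS = {
    "height", "width", "image_id", "task_id",
    "expressions", "bbox_list", "mask_list",
}


def summarize_annotation(record):
    """Validate one annotation dict and return a short summary.

    Assumption: a record is a plain dict holding the seven documented keys.
    """
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise KeyError(f"annotation is missing keys: {sorted(missing)}")
    return {
        "image_id": record["image_id"],
        "task_id": record["task_id"],
        "image_size": (record["width"], record["height"]),
        "num_boxes": len(record["bbox_list"]),
        "num_masks": len(record["mask_list"]),
    }


# Hypothetical record: every value below is invented, not taken from RIO.
example_record = {
    "height": 480,
    "width": 640,
    "image_id": 123,
    "task_id": 7,
    "expressions": "You can use the thing to cook meal.",
    "bbox_list": [[10.0, 20.0, 100.0, 50.0]],
    "mask_list": [[[10.0, 20.0, 110.0, 20.0, 110.0, 70.0, 10.0, 70.0]]],
}

summary = summarize_annotation(example_record)
print(summary)
```

The sketch only inspects the keys and list lengths, so it makes no claim about the coordinate convention of the boxes or the encoding of the masks.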
📁Data Download

Image

Our dataset is built on the COCO2014 dataset, whose images you can download from MSCOCO. The training set images come from the COCO train set, and the images of the common and uncommon test sets come from the COCO val set.
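To make the split-to-image mapping concrete, here is a minimal Python sketch that resolves a COCO image id to a file path. It relies on the standard COCO 2014 file naming (COCO_train2014_ or COCO_val2014_ followed by the 12-digit zero-padded id). The split keys "train", "test_common" and "test_uncommon", the helper name coco_image_path, and the train2014/ and val2014/ sub-folders under a single image root are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: hypothetical split names. Training images come from COCO
# train2014; both test sets come from COCO val2014, as described above.
SPLIT_TO_COCO_SUBSET = {
    "train": "train2014",
    "test_common": "val2014",
    "test_uncommon": "val2014",
}


def coco_image_path(image_root, split, image_id):
    """Return the expected path of a COCO 2014 image for a given split."""
    subset = SPLIT_TO_COCO_SUBSET[split]
    # COCO 2014 files are named COCO_<subset>_<12-digit zero-padded id>.jpg
    filename = f"COCO_{subset}_{image_id:012d}.jpg"
    return Path(image_root) / subset / filename


print(coco_image_path("data/images", "train", 9).as_posix())
print(coco_image_path("data/images", "test_uncommon", 42).as_posix())
```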

Annotation

You can download the latest version of data annotation from here.

🧰Baseline Methods

We provide checkpoints for the baseline models. To verify their performance on the RIO dataset, set up the environment as described in each original repository, then use our modified code and scripts.

1. MDETR & TOIST

cd baselines/TOIST_RIO

TOIST is built on top of the MDETR repository; the only difference between the two is whether distillation is used (TOIST) or not (MDETR).

We have organized the data into the same format; you can download it from here and put it into the data/coco-tasks/annotations/ directory.

You can organize the 'data' folder as follows:

data/
  ├── id2name.json
  ├── images/
  │    ├── train2014/
  │    └── val2014/
  └── coco-tasks/
       └── annotations/
            ├── refcoco_task_test_common.json
            ├── refcoco_task_test_uncommon.json
            └── refcoco_task_train.json
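Before launching the evaluation scripts, it can help to confirm that the folder matches the layout above. The following is a minimal Python sketch of such a check; the helper name missing_data_paths is hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the data folder.
EXPECTED_PATHS = [
    "id2name.json",
    "images/train2014",
    "images/val2014",
    "coco-tasks/annotations/refcoco_task_test_common.json",
    "coco-tasks/annotations/refcoco_task_test_uncommon.json",
    "coco-tasks/annotations/refcoco_task_train.json",
]


def missing_data_paths(data_root="data"):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


missing = missing_data_paths("data")
if missing:
    print("Missing entries:")
    for rel in missing:
        print("  " + rel)
else:
    print("Data folder layout looks complete.")
```

The check only tests for existence, so it does not verify that the image folders are fully populated or that the JSON files are valid.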

Download the corresponding model checkpoint, then run bash scripts/test_wo_distill_seg.sh to evaluate MDETR or bash scripts/test_w_distill_seg.sh to evaluate TOIST. To choose between the "common" and "uncommon" test sets, set refexp_test_set in configs/tdod_rio.json.
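If you switch test sets often, the edit to the config can be scripted. The sketch below is a minimal Python example under two assumptions that the repository does not confirm here: that refexp_test_set is a top-level key of the JSON file, and that it accepts the literal strings "common" and "uncommon". The helper name set_refexp_test_set is hypothetical.

```python
import json
from pathlib import Path

# Assumption: the config accepts exactly these two string values.
VALID_SPLITS = ("common", "uncommon")


def set_refexp_test_set(config_path, split):
    """Set the top-level refexp_test_set key and rewrite the JSON file."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["refexp_test_set"] = split
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Example call, run from baselines/TOIST_RIO:
# set_refexp_test_set("configs/tdod_rio.json", "uncommon")
```

Rewriting the file with json.dumps changes its whitespace, so check the diff before committing the config.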

Model	AP50_Det	Top1_Det	mIoU_Seg	Top1_Seg	Checkpoint
MDETR@common	48.61	65.05	44.14	54.55	link
MDETR@uncommon	24.20	39.60	22.03	34.35	link
TOIST@common	49.05	66.72	45.07	55.85	link
TOIST@uncommon	21.96	39.28	19.41	34.00	link

2. PolyFormer

cd baselines/polygon-transformer

Please refer to PolyFormer for environment configuration, including fairseq and refer.

Train

Download the pre-processed tsv data of the train set from here and put it into datasets/finetune/reftask/.
Change into run_scripts/finetune/ and run the training script.

cd run_scripts/finetune/
bash train_polyformer_b_reftask.sh

Evaluation

Download the pre-processed tsv data from here and put it into datasets/finetune/reftask/.
Download our trained model from here.
Change into run_scripts/evaluation/ and run the evaluation script.

cd run_scripts/evaluation/
bash evaluate_polyformer_b_reftask.sh

Model	mIoU_common	mIoU_uncommon	Checkpoint
PolyFormer	46.16	26.77	link

Demo

The original PolyFormer repository has updated demo.py. With our model weights trained on the RIO dataset, you can test intention-oriented segmentation directly in the demo.

🍞Acknowledgement

Data loading and inference scripts are built on the MDETR, TOIST, SeqTR, and PolyFormer repositories.

qumengxue/RIO