Code for our NeurIPS 2023 paper "Neural-Logic Human-Object Interaction Detection".
Contributed by Liulei Li, Jianan Wei, Wenguan Wang, Yi Yang.
Install the dependencies.
pip install -r requirements.txt
Some files in the LAVIS library have been modified to extract complete visual features from the CLIP model.
The HICO-DET dataset can be downloaded here. After downloading, unpack the tarball (hico_20160224_det.tar.gz) to the data directory.
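The unpacking step can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name unpack_tarball is hypothetical and not part of this repository, and it assumes the tarball has already been downloaded.

```python
import tarfile
from pathlib import Path


def unpack_tarball(archive, dest="data"):
    """Extract a .tar.gz archive into dest, creating dest if needed.

    Hypothetical helper for illustration; not part of this repository.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)
    return dest_dir


# Example call, assuming the tarball sits in the current directory:
# unpack_tarball("hico_20160224_det.tar.gz", "data")
```

Running tar -xzf from a shell does the same job; the Python version is only convenient when the setup is automated from a script.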
Instead of the original annotation files, we use the annotation files provided by the PPDM authors, which can be downloaded from here. Place the downloaded annotation files as follows.
data
 └─ hico_20160224_det
     |─ annotations
     |   |─ trainval_hico.json
     |   |─ test_hico.json
     |   └─ corre_hico.npy
     :
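Before moving on, it may help to confirm that the three annotation files landed where the layout above expects them. The following is a small sketch; the helper missing_files and the constant HICO_ANNOTATION_FILES are hypothetical names introduced here, and the paths are assumed to be relative to the repository root.

```python
from pathlib import Path

# Files named in the layout above, relative to the repository root (assumed).
HICO_ANNOTATION_FILES = [
    "data/hico_20160224_det/annotations/trainval_hico.json",
    "data/hico_20160224_det/annotations/test_hico.json",
    "data/hico_20160224_det/annotations/corre_hico.npy",
]


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as regular files under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]


# Example call from the repository root:
# missing_files(".", HICO_ANNOTATION_FILES)
```

An empty list means every file is in place. The same helper can be reused for the V-COCO files by passing a different list of paths.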
First, clone the V-COCO repository from here and follow its instructions to generate the file instances_vcoco_all_2014.json. Next, download the prior file prior.pickle from here. Create the directories and place the files as follows.
GEN-VLKT
 |─ data
 |   └─ v-coco
 |       |─ data
 |       |   |─ instances_vcoco_all_2014.json
 |       |   :
 |       |─ prior.pickle
 |       |─ images
 |       |   |─ train2014
 |       |   |   |─ COCO_train2014_000000000009.jpg
 |       |   |   :
 |       |   └─ val2014
 |       |       |─ COCO_val2014_000000000042.jpg
 |       |       :
 |       |─ annotations
 :       :
For our implementation, the annotation files have to be converted to the HOIA format. The conversion can be run as follows.
PYTHONPATH=data/v-coco \
python convert_vcoco_annotations.py \
--load_path data/v-coco/data \
--prior_path data/v-coco/prior.pickle \
--save_path data/v-coco/annotations
Note that only Python 2 can be used for this conversion, because vsrl_utils.py in the v-coco repository raises an error under Python 3.
The V-COCO annotations in the HOIA format (corre_vcoco.npy, test_vcoco.json, and trainval_vcoco.json) will be generated in the annotations directory.
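A quick way to sanity-check a generated JSON file is to load it and look at its overall shape. The sketch below makes no assumption about the HOIA schema; summarize_json is a hypothetical helper, not part of this repository, and it only reports what it finds.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Report the top-level type and length of a JSON file.

    If the file holds a non-empty list of objects, the sorted keys of the
    first entry are included as well. Hypothetical helper for illustration.
    """
    with Path(path).open() as f:
        data = json.load(f)
    summary = {
        "type": type(data).__name__,
        "length": len(data) if isinstance(data, (list, dict)) else None,
    }
    if isinstance(data, list) and data and isinstance(data[0], dict):
        summary["first_entry_keys"] = sorted(data[0].keys())
    return summary


# Example call, using the save path from the conversion command:
# summarize_json("data/v-coco/annotations/trainval_vcoco.json")
```

A length of zero or an unexpected top-level type usually indicates that the conversion did not complete.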
Download the pretrained DETR detector model for ResNet50 and put it in the params directory. Then convert the parameters for HICO-DET and V-COCO with the following commands.
python ./tools/convert_parameters.py \
--load_path params/detr-r50-e632da11.pth \
--save_path params/detr-r50-pre-2branch-hico.pth \
--num_queries 64
python ./tools/convert_parameters.py \
--load_path params/detr-r50-e632da11.pth \
--save_path params/detr-r50-pre-2branch-vcoco.pth \
--dataset vcoco \
--num_queries 64
After the preparation, you can start training with the following commands.
sh ./configs/hico.sh
sh ./configs/vcoco.sh
Please consider citing our paper if it helps your research.
@article{li2024neural,
title={Neural-logic human-object interaction detection},
author={Li, Liulei and Wei, Jianan and Wang, Wenguan and Yang, Yi},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}
LogicHOI is released under the MIT license. See LICENSE for additional details.
Parts of the code are built upon DETR and GEN-VLKT. We thank the authors for their great work!