IKEA Assembly Dataset

This repo contains code for the "IKEA assembly dataset". This is a development repo; after cleanup, it will be made publicly available on GitHub.

Link to google drive video dataset ~240GB

Link to project website

The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose

Introduction

This is the code for processing the IKEA assembly dataset.

This work will be presented at WACV 2021.

Abstract:

The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks. In the context of understanding human activities, existing public datasets, while large in size, are often limited to a single RGB camera and provide only per-frame or per-clip action annotations. To enable richer analysis and understanding of human activities, we introduce IKEA ASM: a three million frame, multi-view, furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose. Additionally, we benchmark prominent methods for video action recognition, object segmentation and human pose estimation tasks on this challenging dataset. The dataset enables the development of holistic methods, which integrate multi-modal and multi-view data to better perform on these tasks.

Citation

If you find this dataset useful in your research, please cite our work:

Preprint:

@article{ben2020ikea,
  title={The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose},
  author={Ben-Shabat, Yizhak and Yu, Xin and Saleh, Fatemeh Sadat and Campbell, Dylan and Rodriguez-Opazo, Cristian and Li, Hongdong and Gould, Stephen},
  journal={arXiv preprint arXiv:2007.00394},
  year={2020}
}

WACV2021:

@inproceedings{ben2021ikea,
  title={The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose},
  author={Ben-Shabat, Yizhak and Yu, Xin and Saleh, Fatemeh and Campbell, Dylan and Rodriguez-Opazo, Cristian and Li, Hongdong and Gould, Stephen},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={847--859},
  year={2021}
}

Installation

Please first download the dataset using the provided links: Full dataset download

Alternatively, you can download only the relevant parts:

After downloading the video data, extract the individual frames using ./toolbox/extract_frames_from_videos.py. For further processing of the data, refer to the README.md file of each benchmark.

For dependencies, see requirements.txt.
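To illustrate what the frame-extraction step involves, here is a minimal sketch that walks a folder of videos and dumps each one to a folder of numbered images by calling the ffmpeg command-line tool. It is not the repo's extract_frames_from_videos.py; the function names, the `.avi` extension, the `%06d.jpg` frame naming, and the mirrored output layout are all assumptions made for illustration, and ffmpeg must be installed separately.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_dir, pattern="%06d.jpg"):
    """Return the ffmpeg argument list that dumps every frame of one video.

    The frame naming pattern is an assumption, not the dataset's convention.
    """
    return ["ffmpeg", "-i", str(video_path), str(Path(frames_dir) / pattern)]


def plan_extraction(dataset_root, output_root, video_ext=".avi"):
    """Pair each video under dataset_root with a mirrored output directory.

    The video extension is an assumption; adjust it to the downloaded files.
    """
    dataset_root, output_root = Path(dataset_root), Path(output_root)
    jobs = []
    for video in sorted(dataset_root.rglob(f"*{video_ext}")):
        # Mirror the relative folder layout, with one frame folder per video.
        frames_dir = output_root / video.relative_to(dataset_root).with_suffix("")
        jobs.append((video, frames_dir))
    return jobs


def extract_all(dataset_root, output_root, video_ext=".avi"):
    """Run ffmpeg once per video; requires ffmpeg on the PATH."""
    for video, frames_dir in plan_extraction(dataset_root, output_root, video_ext):
        frames_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(build_ffmpeg_command(video, frames_dir), check=True)
```

Calling `extract_all("path/to/videos", "path/to/frames")` would then process every matching video in turn. With roughly 240GB of video, the extracted frames are likely to need considerably more disk space, so it may be worth running `plan_extraction` first to inspect what would be written.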

Benchmarks

We provide several benchmarks:

  • Action recognition
  • Pose Estimation
  • Part segmentation and tracking

Please refer to the README.md file in each benchmark directory for further details on training, testing, and evaluating the different benchmarks (action recognition, pose estimation, instance segmentation, and part tracking). Make sure to download the relevant pretrained models from the links above.

License

Our code is released under the MIT license (see the LICENCE.txt file).