Human Hands as Probes for Interactive Object Understanding

CVPR 2022

Mohit Goyal, Sahil Modi, Rishabh Goyal, Saurabh Gupta
University of Illinois at Urbana-Champaign

[Project Page] [arXiv]

Abstract: Interactive object understanding, or what we can do to objects and how, is a long-standing goal of computer vision. In this paper, we tackle this problem through observation of human hands in in-the-wild egocentric videos. We demonstrate that observing what human hands interact with, and how, can provide both the relevant data and the necessary supervision. Attending to hands readily localizes and stabilizes active objects for learning and reveals places where interactions with objects occur. Analyzing the hands shows what we can do to objects and how. We apply these basic principles to the EPIC-KITCHENS dataset and successfully learn state-sensitive features and object affordances (regions of interaction and afforded grasps), purely by observing hands in egocentric videos.

Motivation

Prerequisites

Anaconda or Miniconda is required to set up the environment for training the models. You will also need to download the EPIC-KITCHENS dataset (available here) and the hand-object bounding boxes.

conda env create --file environment.yml
conda activate humanhands
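
Before launching any training, it can help to confirm that the environment was actually created. The following is a minimal sketch, not part of the repository: the helper names (`env_names`, `has_env`, `query_conda`) are hypothetical, and it assumes only that `conda env list --json` prints a JSON object with an `envs` list of environment paths.

```python
import json
import re
import subprocess


def env_names(conda_json: str) -> set:
    """Return the final path component of every environment listed in
    the JSON printed by `conda env list --json`."""
    paths = json.loads(conda_json).get("envs", [])
    # Split on both separators so Windows-style paths are handled too.
    return {re.split(r"[\\/]", p.rstrip("\\/"))[-1] for p in paths}


def has_env(conda_json: str, name: str = "humanhands") -> bool:
    """True if an environment called `name` appears in the listing."""
    return name in env_names(conda_json)


def query_conda() -> str:
    """Run conda and return its JSON listing (requires conda on PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Usage (hypothetical, run only where conda is installed):
#   print("ready" if has_env(query_conda()) else "humanhands env missing")
```

Note that the base environment shows up under its install directory name rather than as `base`, which does not matter here because only `humanhands` is looked up.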

Learning State-Sensitive Features

Please follow the link above to reproduce the pretraining of the state-sensitive features model.

This contains the EPIC-STATES dataset and evaluation code to benchmark the pretrained model.

Learning Object Affordances

Please follow the link above to retrain the ACP (Affordances by Context Prediction) models.

This contains the EPIC-ROI dataset and evaluation code to benchmark the pretrained model.

This contains the GAO Benchmark and evaluation code to benchmark the pretrained model.

Citing

If you find this work and/or dataset useful in your research, please cite:

@inproceedings{goyal2022human,
      title={Human Hands as Probes for Interactive Object Understanding}, 
      author={Mohit Goyal and Sahil Modi and Rishabh Goyal and Saurabh Gupta},
      year={2022},
      booktitle = {Computer Vision and Pattern Recognition (CVPR)}
}