Hands23 data

Data repository for the paper: Towards A Richer 2D Understanding of Hands at Scale (NeurIPS 2023).

Tianyi Cheng*, Dandan Shan*, Ayda Hassen, Richard Higgins, David Fouhey.

This repository contains the SAM mask generation and annotation visualization code for the Hands23 dataset.

Download and unzip

wget https://fouheylab.eecs.umich.edu/~dandans/projects/hands23/data/hands23_data.zip
unzip hands23_data.zip -d where/to/unzip/

After unzipping, the folder structure will look like the tree below. allMergedSplit contains the train/val/test splits, allMergedBlur contains all JPG images, and allMergedTxt contains one TXT annotation file per JPG image (same filename with a .txt suffix appended). A small path-pairing sketch follows the tree.

hands23_data/
├── allMergedSplit
|   ├── TEST.txt
|   ├── TRAIN.txt
|   └── VAL.txt
├── allMergedBlur
|   └── *.jpg
├── allMergedTxt
|   └── *.jpg.txt
└── masks_sam (to be created in the next step)
    └── *.png
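
For orientation, here is a minimal path-pairing sketch (not part of the release). It assumes each split file lists one image filename per line; adjust if the actual split format differs.

# Minimal sketch: pair each split entry with its image and annotation paths.
# Assumption: split files list one image name per line.
from pathlib import Path

root = Path("hands23_data")

def iter_split(split="TRAIN"):
    for name in (root / "allMergedSplit" / f"{split}.txt").read_text().splitlines():
        name = name.strip()
        if not name:
            continue
        img_path = root / "allMergedBlur" / name            # *.jpg
        ann_path = root / "allMergedTxt" / (name + ".txt")   # *.jpg.txt
        yield img_path, ann_path

for img_path, ann_path in iter_split("VAL"):
    print(img_path, ann_path)
    break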

SAM mask generation

The downloaded hands23_data.zip contains all annotations except the SAM masks. Please follow the Segment Anything instructions for environment installation and download the ViT-H SAM model checkpoint (sam_vit_h_4b8939.pth). Then generate SAM masks for the raw-format data by running

python sam/get_sam_masks.py
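
For reference, below is a minimal sketch of the kind of box-prompted call this step builds on, using the public segment_anything API. The image filename and box coordinates are hypothetical placeholders; the actual prompts and output naming are handled by sam/get_sam_masks.py.

# Minimal sketch of box-prompted SAM mask generation (hypothetical filename
# and box; sam/get_sam_masks.py is the reference implementation).
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("hands23_data/allMergedBlur/example.jpg"),
                     cv2.COLOR_BGR2RGB)
predictor.set_image(image)

box = np.array([100, 150, 300, 400])  # hypothetical x1, y1, x2, y2 from an annotation
masks, scores, _ = predictor.predict(box=box, multimask_output=False)

cv2.imwrite("hands23_data/masks_sam/example.png",
            (masks[0] * 255).astype(np.uint8))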

Visualization

Visualize annotations: load from raw format and plot annotations on images.

python vis/vis_hands23.py
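
As a quick sanity check of the generated masks, a minimal overlay sketch like the one below can be used (hypothetical filenames; vis/vis_hands23.py is the full visualizer that also draws the TXT annotations).

# Minimal sketch: overlay a generated SAM mask on its image.
import cv2
import numpy as np

image = cv2.imread("hands23_data/allMergedBlur/example.jpg")
mask = cv2.imread("hands23_data/masks_sam/example.png", cv2.IMREAD_GRAYSCALE) > 0

overlay = image.copy()
overlay[mask] = (0.5 * overlay[mask] + 0.5 * np.array([0, 255, 0])).astype(np.uint8)
cv2.imwrite("example_overlay.jpg", overlay)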

Citing

If you find this data and code useful for your research, please consider citing the Hands23 paper:

@inproceedings{cheng2023towards,
  title={Towards a richer 2d understanding of hands at scale},
  author={Cheng, Tianyi and Shan, Dandan and Hassen, Ayda Sultan and Higgins, Richard Ely Locke and Fouhey, David},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}

Please also cite the following papers, from which the Hands23 subsets originate: COCO (Lin et al.), VISOR (Darkhalil et al.), and Articulation (Qian et al.).

@inproceedings{lin2014microsoft,
  title={Microsoft coco: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={European conference on computer vision},
  pages={740--755},
  year={2014},
  organization={Springer}
}
@inproceedings{VISOR,
  title={EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations},
  author={Darkhalil, Ahmad and Shan, Dandan and Zhu, Bin and Ma, Jian and Kar, Amlan and Higgins, Richard and Fidler, Sanja and Fouhey, David and Damen, Dima},
  booktitle={Proceedings of the Neural Information Processing Systems (NeurIPS) Track on Datasets and Benchmarks},
  year={2022}
}
@inproceedings{Qian22,
  title={Understanding 3D Object Articulation in Internet Videos},
  author={Shengyi Qian and Linyi Jin and Chris Rockwell and Siyi Chen and David F. Fouhey},
  booktitle={CVPR},
  year={2022}
}