EgoObjects is a large-scale egocentric dataset for fine-grained object understanding. It features videos captured by various wearable devices at locations worldwide, objects from a diverse set of categories commonly seen in indoor environments, and videos of the same object instance captured under diverse conditions. The dataset supports both the conventional category-level and the novel instance-level object detection tasks.
For this release, we have annotated 114K frames (79K train, 5.7K val, 29.5K test) sampled from 9K+ videos collected by 250 participants across the world. A total of 14.4K unique object instances from 368 categories are annotated. Among them, there are 1.3K main object instances from 206 categories and 13.1K secondary object instances (i.e., objects accompanying the main object) from 353 categories. On average, each image is annotated with 5.6 instances from 4.8 categories, and each object instance appears in 44.8 images, which ensures diverse viewing directions for the object.
Release v1.0 is publicly available. Images (~40 GB) can be downloaded from link. Unified annotations for category- and instance-level object detection can be downloaded from links including train, eval, and metadata. Place them under $EgoObjects_ROOT/data/. We follow the same data format as LVIS, with EgoObjects-specific changes.
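Because the annotations follow the LVIS layout, a downloaded file can be inspected with plain Python before any detection code is involved. The sketch below is illustrative only: it assumes the standard LVIS top-level keys (`images`, `annotations`, `categories`) and the per-annotation fields `image_id` and `category_id`; the EgoObjects-specific fields are not covered, and the file name in the trailing comment is a hypothetical placeholder rather than the actual release name.

```python
import json
from collections import Counter


def summarize(dataset):
    """Compute basic statistics from an LVIS-style annotation dict.

    Assumes the LVIS top-level keys "images", "annotations" and "categories".
    """
    num_images = len(dataset["images"])
    annotations = dataset["annotations"]
    per_category = Counter(a["category_id"] for a in annotations)
    # Fall back to the numeric id if a category has no "name" field.
    id_to_name = {c["id"]: c.get("name", str(c["id"])) for c in dataset["categories"]}
    return {
        "num_images": num_images,
        "num_annotations": len(annotations),
        "avg_annotations_per_image": len(annotations) / num_images if num_images else 0.0,
        "annotations_per_category": {
            id_to_name.get(cid, str(cid)): n for cid, n in per_category.items()
        },
    }


# Tiny synthetic dataset in the assumed layout, used only for illustration.
toy = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 10, "name": "mug"}, {"id": 11, "name": "lamp"}],
    "annotations": [
        {"id": 100, "image_id": 1, "category_id": 10, "bbox": [0, 0, 5, 5]},
        {"id": 101, "image_id": 1, "category_id": 11, "bbox": [1, 1, 4, 4]},
        {"id": 102, "image_id": 2, "category_id": 10, "bbox": [2, 2, 3, 3]},
    ],
}
print(summarize(toy))

# For a real file (hypothetical name, adjust to the downloaded annotation file):
# with open("data/<annotation_file>.json") as f:
#     print(summarize(json.load(f)))
```

Running the same function on the train and eval files gives a quick sanity check that the download is complete and that the category list matches the metadata.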
- Linux with Python ≥ 3.8
- PyTorch ≥ 1.8. Install PyTorch and torchvision together from pytorch.org so that their versions match. Also check that the PyTorch version matches the one required by Detectron2.
- Detectron2: follow Detectron2 installation instructions.
conda create --name egoobjects python=3.9
conda activate egoobjects
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
# under your working directory
git clone https://github.com/facebookresearch/EgoObjects.git
cd EgoObjects
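Before running the example, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of the EgoObjects repository: `check_environment` is a hypothetical helper, and the distribution names `torch`, `torchvision`, and `detectron2` are assumed to match how the packages were installed. It only reports versions; comparing them against the Detectron2 requirements is left to the reader.

```python
import sys
from importlib import metadata


def check_environment(min_python=(3, 8),
                      packages=("torch", "torchvision", "detectron2")):
    """Report the Python version and the installed version of each package.

    A package that is not installed is reported as None.
    """
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= tuple(min_python),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `None` entry means the package was not found in the active environment, which usually indicates that the `egoobjects` conda environment is not activated.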
If everything is set up correctly, run our evaluation example code to get mock results for the category- and instance-level detection tasks:
python example.py
If you find this code or data useful in your research, please cite our paper:
@inproceedings{zhu2023egoobjects,
title={EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding},
author={Zhu, Chenchen and Xiao, Fanyi and Alvarado, Andrés and Babaei, Yasmine and Hu, Jiabo and El-Mohri, Hichem and Chang, Sean and Sumbaly, Roshan and Yan, Zhicheng},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2023}
}
The code is a rewrite of the Python API for LVIS. The core functionality is the same, with EgoObjects-specific changes.
EgoObjects is licensed under the MIT License.