
Object State Change Classification

This is the implementation of our submission (TarHeels) to the Ego4D: Object State Change Classification Challenge at the 1st Ego4D Workshop, CVPR 2022. We use a transformer-based video recognition model and leverage the Divided Space-Time Attention mechanism to classify object state changes in egocentric videos. Our submission achieved the second-best performance in the challenge.
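As a rough illustration of what Divided Space-Time Attention means, the sketch below lists which tokens a single patch token attends to in the temporal step (same patch position, all frames) and in the spatial step (same frame, all patch positions). It is not code from this repository: the function names are hypothetical, the classification token is ignored, and the 8-frame, 196-patch setting assumes a 224x224 crop split into 16x16 patches.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of this codebase.
# A video is treated as a grid of tokens indexed by (frame t, patch position s).

def temporal_neighbors(t, s, num_frames, num_patches):
    """Temporal step: the token at (t, s) attends to position s in every frame."""
    return [(tt, s) for tt in range(num_frames)]


def spatial_neighbors(t, s, num_frames, num_patches):
    """Spatial step: the token at (t, s) attends to every position in frame t."""
    return [(t, ss) for ss in range(num_patches)]


def comparisons_per_token(num_frames, num_patches):
    """Query-key comparisons per token (classification token ignored).

    Joint space-time attention compares a token with all T * N tokens;
    the divided scheme compares it with T tokens, then with N tokens.
    """
    joint = num_frames * num_patches
    divided = num_frames + num_patches
    return joint, divided


if __name__ == "__main__":
    # Assumed setting: 8 frames, 14 x 14 = 196 patches per frame.
    joint, divided = comparisons_per_token(8, 196)
    print(f"joint: {joint} comparisons, divided: {divided} comparisons")
```

Under these assumptions each token takes part in far fewer comparisons than with joint attention over all frames and patches, which is the usual motivation for the divided scheme.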

Technical Report

You can download the technical report of our submission from here.

Steps to run the codebase

  • Follow the instructions from TimeSformer for setup and installation.
  • Run create_fho_clips.py for processing and creating video clips.
  • Run create_fho_dataset.py for creating the dataset.
  • Use the following command to train the model.
python tools/run_net.py \
 --cfg configs/Ego4dFho/TimeSformer_divST_8x32_224.yaml \
 DATA.PATH_TO_DATA_DIR path_to_your_dataset \
 NUM_GPUS 8 \
 TRAIN.BATCH_SIZE 8
  • Finally, run generate_submission.py to generate the submission file for the challenge.
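
The steps above could be collected into a small driver script, sketched below. It only assembles and prints the commands. The helper names are hypothetical, and it assumes that the three scripts sit in the directory you run from and take no command-line arguments, which this README does not state.

```python
# Minimal sketch, not part of the repository: helper names are hypothetical.
# Assumes the scripts take no arguments and are run from the repository root.
import shlex


def build_train_command(data_dir, num_gpus=8, batch_size=8):
    """Return the training command from the steps above as an argument list."""
    return [
        "python", "tools/run_net.py",
        "--cfg", "configs/Ego4dFho/TimeSformer_divST_8x32_224.yaml",
        "DATA.PATH_TO_DATA_DIR", str(data_dir),
        "NUM_GPUS", str(num_gpus),
        "TRAIN.BATCH_SIZE", str(batch_size),
    ]


def build_pipeline(data_dir, num_gpus=8, batch_size=8):
    """Return the commands in the order the steps list them."""
    return [
        ["python", "create_fho_clips.py"],       # create video clips
        ["python", "create_fho_dataset.py"],     # create the dataset
        build_train_command(data_dir, num_gpus, batch_size),
        ["python", "generate_submission.py"],    # write the submission file
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for cmd in build_pipeline("path_to_your_dataset"):
        print(shlex.join(cmd))
```

To execute the steps rather than print them, each argument list can be passed to `subprocess.run(cmd, check=True)`, which stops at the first step that fails.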