Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos
Luigi Seminara, Giovanni Maria Farinella, Antonino Furnari
NeurIPS (spotlight), 2024
🚧 WORK IN PROGRESS:
- Baselines
- Direct Optimization (DO) Model
- Task Graph Transformer (TGT) Model
- Online Mistake Detection
Contents:
- Environment configuration
- Data
- Training
- Qualitative results
- Get DO results of Table 1 of the paper
- Citation
- Authors
The code was tested with Python 3.9. Run the following commands to configure a new conda environment:
conda create -n tgml python=3.9
conda activate tgml
python -m pip install -e ./lib
conda install -c conda-forge pygraphviz
The following versions of PyTorch and its associated libraries are recommended for compatibility:
- PyTorch: 2.0.1
- Torchvision: 0.15.2
- Torchaudio: 2.0.2
- PyTorch with CUDA: Version 11.7
These packages can be installed using the following command:
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
While these versions are recommended, newer versions of these libraries may also work. If you use different versions, verify that they are compatible with one another and with the installed CUDA version.
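To see whether an environment matches the recommended versions, a small check along the following lines may help. This is only a sketch: the helper `check_versions` is a hypothetical name introduced here and is not part of this repository. It relies on the standard-library `importlib.metadata` module and assumes the packages are registered under the distribution names `torch`, `torchvision`, and `torchaudio`.

```python
from importlib import metadata

# Recommended versions, copied from the list above.
RECOMMENDED = {"torch": "2.0.1", "torchvision": "0.15.2", "torchaudio": "2.0.2"}


def check_versions(recommended, get_version=metadata.version):
    """Hypothetical helper: return {package: (installed_or_None, recommended, matches)}."""
    report = {}
    for package, wanted in recommended.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Some builds carry a local suffix such as "2.0.1+cu117"; compare the public part only.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (installed, wanted, matches)
    return report


if __name__ == "__main__":
    for package, (installed, wanted, ok) in check_versions(RECOMMENDED).items():
        status = "OK" if ok else "differs"
        print(f"{package}: installed={installed} recommended={wanted} [{status}]")
```

A mismatch reported here does not necessarily mean the code will fail, since newer versions may also work.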
The ./data directory contains the CaptainCook4D data prepared for our task, provided in compliance with the license defined by the original authors. Our split differs from the splits defined by the CaptainCook4D authors because we include only annotations that do not contain errors. For more information about the original dataset, please visit the official CaptainCook4D repository.
To generate a single task graph, run:
python train.py -cfg ./configs/CaptainCook4D/Ramen.yaml
Usage: train.py [OPTIONS]
Options:
  -cfg, --config TEXT  Path to the config file. You can find the config file
                       in the config folder.  [required]
  -l, --log            Log the output to a file.
  -s, --seed INTEGER   Seed for reproducibility.
  --help               Show this message and exit.
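The options listed above can also be driven from a script, for example to train the same procedure with several seeds. The following is a minimal sketch and not part of the repository: `build_train_command` and `train_with_seeds` are hypothetical helper names, the seed values are arbitrary, and it assumes the script is launched from the repository root so that `train.py` resolves.

```python
import subprocess
import sys


def build_train_command(config_path, seed=None, log=False):
    """Build the argument list for one train.py run using the documented options."""
    command = [sys.executable, "train.py", "-cfg", config_path]
    if seed is not None:
        command += ["-s", str(seed)]  # -s / --seed INTEGER
    if log:
        command.append("-l")  # -l / --log
    return command


def train_with_seeds(config_path, seeds, dry_run=True):
    """Return one command per seed; execute them only when dry_run is False."""
    commands = [build_train_command(config_path, seed=s, log=True) for s in seeds]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # raises if a run exits with an error
    return commands


if __name__ == "__main__":
    # Arbitrary example seeds, not the ones used in the paper.
    for command in train_with_seeds("./configs/CaptainCook4D/Ramen.yaml", seeds=[0, 1, 2]):
        print(" ".join(command))
```

With `dry_run=True` (the default in this sketch) the commands are only printed, which makes it easy to inspect them before starting any training.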
To generate all task graphs, run:
python train_all.py --more_seeds
Usage: train_all.py [OPTIONS]
Options:
  --more_seeds  Use multiple seeds for error bars.
  --help        Show this message and exit.
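The --more_seeds option mentions error bars. As an illustration of what that usually means, the sketch below summarizes per-seed scores as mean and sample standard deviation. It is an assumption about the general idea only: the scores are made up, `summarize` is a hypothetical helper, and the repository's own scripts may aggregate results differently.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    if len(scores) < 2:
        raise ValueError("need at least two seeds to compute a standard deviation")
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Made-up scores for three seeds, for illustration only (not results from the paper).
    made_up_scores = [0.50, 0.60, 0.70]
    average, spread = summarize(made_up_scores)
    print(f"{average:.2f} +/- {spread:.2f}")
```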
The figure shows the task graphs for the procedure "Dressed Up Meatballs". The ground truth task graph is on the left, and the graph generated by the Direct Optimization model is on the right. These graphs must be read from the bottom up, reflecting the bottom-up nature of dependency edges.
[Figure: ground truth task graph (left) and generated task graph (right)]
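To make the bottom-up reading concrete, the sketch below stores a toy graph as a mapping from each step to the steps it depends on, then derives an execution order in which every step comes after its preconditions. The step names are invented for illustration and are not taken from the dataset, the helpers are hypothetical, and the edge direction (from a step to its preconditions) is an assumption based on the description above. It uses `graphlib` from the standard library, available from Python 3.9.

```python
from graphlib import TopologicalSorter

# Toy procedure with invented step names. Each key maps a step to its preconditions,
# so an edge points from a step "up" to the steps drawn below it.
DEPENDENCIES = {
    "serve": {"cook meatballs", "prepare sauce"},
    "cook meatballs": {"shape meatballs"},
    "shape meatballs": {"mix ingredients"},
    "prepare sauce": set(),
    "mix ingredients": set(),
}


def valid_order(dependencies):
    """Return one ordering in which every step follows all of its preconditions."""
    return list(TopologicalSorter(dependencies).static_order())


def is_valid_sequence(sequence, dependencies):
    """True if each step in the sequence appears only after all of its preconditions."""
    done = set()
    for step in sequence:
        if not dependencies.get(step, set()) <= done:
            return False  # a precondition has not been performed yet
        done.add(step)
    return True


if __name__ == "__main__":
    print(valid_order(DEPENDENCIES))
    print(is_valid_sequence(["mix ingredients", "cook meatballs"], DEPENDENCIES))
```

Steps with no preconditions sit at the bottom of the drawing and can be performed first. A sequence that performs a step before one of its preconditions is rejected by `is_valid_sequence`.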
Run the following command after python train_all.py --more_seeds:
python captaincook4d_results.py
If you use the code/models hosted in this repository, please cite the following paper:
@misc{seminara2024differentiable,
title={Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos},
author={Luigi Seminara and Giovanni Maria Farinella and Antonino Furnari},
year={2024},
eprint={2406.01486},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Please refer to the paper for more technical details.