VideoAgent experiments

The official codebase for running the experiments described in the VideoAgent paper. The codebase for training video policies is available here.

This repository contains the code for running the experiments presented in our work
VideoAgent: Self-Improving Video Generation
Achint Soni, Sreyas Venkataraman, Abhranil Chandra, Sebastian Fischmeister, Percy Liang, Bo Dai, Sherry Yang
website | paper | arXiv | experiment repo

@misc{soni2024videoagentselfimprovingvideogeneration,
      title={VideoAgent: Self-Improving Video Generation}, 
      author={Achint Soni and Sreyas Venkataraman and Abhranil Chandra and Sebastian Fischmeister and Percy Liang and Bo Dai and Sherry Yang},
      year={2024},
      eprint={2410.10076},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.10076}, 
}

Getting started

We recommend creating a new conda environment with PyTorch installed.

conda create -n videoagent_exp python=3.9
conda activate videoagent_exp
conda install pytorch=2.2.0 torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
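Before continuing, it can help to confirm that the new environment actually sees PyTorch and a GPU. The snippet below is a minimal sketch, not part of the repository: `check_torch` is a hypothetical helper name, and it assumes `python` on your `PATH` is the interpreter from the `videoagent_exp` environment.

```shell
# Hypothetical helper (not part of this repository).
# Prints the installed PyTorch version and whether CUDA is usable.
# Returns non-zero if the interpreter cannot import torch.
check_torch() {
    # Optional first argument: the Python interpreter to use (defaults to `python`).
    "${1:-python}" -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
}

check_torch || echo "PyTorch could not be imported; is the videoagent_exp environment active?"
```

If the second value printed is `False`, PyTorch is installed but cannot use CUDA, which is worth resolving before running the GPU benchmarks.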

Next, clone the repository and install the requirements:

git clone https://github.com/Video-as-Agent/VideoAgent_exp.git
cd VideoAgent_exp
pip install -r requirements.txt

Download the Checkpoints

We provide the checkpoints used in our main experiments. You can download them using download.sh, for example:

bash download.sh metaworld
# bash download.sh online
# bash download.sh suggestive
# bash download.sh ithor
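After downloading, a quick existence check can catch a missing or misplaced checkpoint before a benchmark run starts. The following is a sketch under assumptions: `check_ckpts` is a hypothetical helper, and the `ckpts/` location relative to the repository root is inferred from the `../ckpts/...` paths mentioned in the benchmark comments in this README, not from `download.sh` itself.

```shell
# Hypothetical helper (not part of this repository).
# Reports whether each given checkpoint file exists.
# Returns non-zero if at least one file is missing.
check_ckpts() {
    missing=0
    for ckpt in "$@"; do
        if [ -f "$ckpt" ]; then
            echo "found:   $ckpt"
        else
            echo "missing: $ckpt"
            missing=1
        fi
    done
    return "$missing"
}

# Run from the repository root. The path is taken from the Meta-World
# benchmark comment in this README; adjust it if your layout differs.
check_ckpts ckpts/metaworld/model-305.pt || echo "Try: bash download.sh metaworld"
```

The same function accepts several paths at once, so the iTHOR and online checkpoints can be checked in a single call.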

Running the Experiments

First, cd into the experiment directory.

cd experiment

Meta-World

To run the full VideoAgent on Meta-World, run the following command:

# make sure you have the checkpoint ../ckpts/metaworld/model-305.pt
bash benchmark_mw.sh 0
# the argument 0 is the GPU ID; change it to run on a different GPU

To run the full VideoAgent-Online on Meta-World, run the following command:

# make sure you have the checkpoint ../ckpts/metaworld/model-3053083.pt
bash benchmark_mw_online.sh 0

To generate Meta-World data for experiments, run the following command:

# edit the collection config in collect_dataset_mw.py before running
# note: this path appears to be relative to the repository root, not to experiment/
python experiment/collect_dataset_mw.py

iTHOR

To run the full VideoAgent on iTHOR, run the following command:

# make sure you have the checkpoint ../ckpts/ithor/model-24.pt
bash benchmark_thor.sh 0

Acknowledgements

This codebase is modified from the following repositories:
AVDC