This is the official codebase for running the experiments described in the VideoAgent paper. The codebase for training video policies is available here.
This repository contains the code for running the experiments presented in our work:
VideoAgent: Self-Improving Video Generation
Achint Soni,
Sreyas Venkataraman,
Abhranil Chandra,
Sebastian Fischmeister,
Percy Liang,
Bo Dai,
Sherry Yang
website | paper | arXiv | experiment repo
@misc{soni2024videoagentselfimprovingvideogeneration,
title={VideoAgent: Self-Improving Video Generation},
author={Achint Soni and Sreyas Venkataraman and Abhranil Chandra and Sebastian Fischmeister and Percy Liang and Bo Dai and Sherry Yang},
year={2024},
eprint={2410.10076},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.10076},
}
We recommend creating a new conda environment with PyTorch installed:
conda create -n videoagent_exp python=3.9
conda activate videoagent_exp
conda install pytorch=2.2.0 torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
Next, clone the repository and install the requirements:
git clone https://github.com/Video-as-Agent/VideoAgent_exp.git
cd VideoAgent_exp
pip install -r requirements.txt
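The later commands assume the `videoagent_exp` environment is active. A minimal sketch of a guard for that is shown below. The function name `check_conda_env` is hypothetical and not part of this repository; it relies only on the `CONDA_DEFAULT_ENV` variable that conda sets on activation.

```shell
# Hypothetical helper (not part of the repository): succeeds only if the
# currently active conda environment matches the expected name.
check_conda_env() {
  expected="$1"
  # conda exports CONDA_DEFAULT_ENV when an environment is activated
  active="${CONDA_DEFAULT_ENV:-}"
  if [ "$active" != "$expected" ]; then
    echo "expected conda env '$expected', found '${active:-none}'" >&2
    return 1
  fi
}
```

For example, `check_conda_env videoagent_exp || conda activate videoagent_exp` could be run before starting an experiment.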
We provide the checkpoints used in our main experiments. You can download them using download.sh, for example:
bash download.sh metaworld
# bash download.sh online
# bash download.sh suggestive
# bash download.sh ithor
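If all four checkpoint sets are needed, the individual calls above can be wrapped in a loop. This is only a sketch: the function name `download_all` is hypothetical, and it assumes `download.sh` exits with a non-zero status on failure, which the source does not confirm.

```shell
# Checkpoint set names taken from the download.sh examples above.
CKPT_SETS="metaworld online suggestive ithor"

# Hypothetical helper (not part of the repository).
# $1: path to the download script, e.g. download.sh in the repository root.
download_all() {
  for name in $CKPT_SETS; do
    # Stop at the first failed download (assumes a non-zero exit on failure).
    bash "$1" "$name" || { echo "download failed for: $name" >&2; return 1; }
  done
}
```

From the repository root, this would be invoked as `download_all download.sh`.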
First, cd into the experiment directory.
cd experiment
To run the full VideoAgent on Meta-World, run the following command:
# make sure you have the checkpoint ../ckpts/metaworld/model-305.pt
bash benchmark_mw.sh 0
# the argument 0 is the GPU ID; change it to run on a different GPU
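The "make sure you have the checkpoint" reminder can also be enforced in the shell, so that a benchmark does not start without its checkpoint. The sketch below uses a hypothetical helper, `run_with_ckpt`, that is not part of this repository; it checks only that the file exists, not that it is a valid checkpoint.

```shell
# Hypothetical helper (not part of the repository).
# $1: checkpoint path; remaining arguments: the command to run.
run_with_ckpt() {
  ckpt="$1"
  shift
  if [ ! -f "$ckpt" ]; then
    echo "checkpoint not found: $ckpt" >&2
    return 1
  fi
  # Run the given command unchanged once the checkpoint is present.
  "$@"
}
```

For the Meta-World run above, this would look like `run_with_ckpt ../ckpts/metaworld/model-305.pt bash benchmark_mw.sh 0`.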
To run the full VideoAgent-Online on Meta-World, run the following command:
# make sure you have the checkpoint ../ckpts/metaworld/model-3053083.pt
bash benchmark_mw_online.sh 0
To generate Meta-World data for the experiments, run the following command:
# make sure you change the collection config in collect_dataset_mw.py
python experiment/collect_dataset_mw.py
To run the full VideoAgent on iTHOR, run the following command:
# make sure you have the checkpoint ../ckpts/ithor/model-24.pt
bash benchmark_thor.sh 0
This codebase is modified from the following repositories:
AVDC