Language Control Diffusion

Build status Dependencies Status Code style: black Security: bandit Pre-commit Semantic Versions License

Efficiently scaling through space, time, and tasks

🚀 Features

Fast, efficient and easy training

  • We've spent months optimizing this code to make diffusion models train fast on CALVIN, and can now offer full training to 300K gradient steps in 8 hours and evaluation on 1000 tasks in 6 hours. These numbers were measured on a single A10G (24 GB).

Flexible policy abstraction levels

  • You can easily adjust how much of the burden falls on the high-level versus the low-level policy by simply changing the temporal stride (referred to as clip_stride in the code); see the sketch below.
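
Below is a rough illustration of that trade-off, not the repository's actual API (make_goal_sequence is a hypothetical helper): a larger stride yields fewer, more widely spaced high-level goals and shifts more of the control burden onto the low-level policy.

```python
import numpy as np

def make_goal_sequence(states: np.ndarray, clip_stride: int) -> np.ndarray:
    """Subsample a trajectory of latent states with a temporal stride.

    A larger stride means the high-level policy plans fewer, more distant
    goals, leaving more of the control burden to the low-level policy.
    """
    return states[::clip_stride]

# Toy trajectory: 64 timesteps of 32-dimensional latent states
traj = np.random.randn(64, 32)
print(make_goal_sequence(traj, clip_stride=1).shape)   # (64, 32): a goal at every step
print(make_goal_sequence(traj, clip_stride=16).shape)  # (4, 32): low level bridges long gaps
```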

Universal interface

  • Our high-level diffusion policy is independent of the low-level policy. This means you can plug and play different low-level policies and environments; a minimal sketch of the interface follows.
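
Conceptually, the only contract between the two levels is that the high-level policy emits a goal the low-level policy can condition on. The sketch below uses hypothetical names (LowLevelPolicy, plan, act) rather than the repository's actual classes:

```python
from typing import Protocol

import numpy as np

class LowLevelPolicy(Protocol):
    """Any policy that maps (observation, latent goal) to an action can be plugged in."""

    def act(self, obs: np.ndarray, goal: np.ndarray) -> np.ndarray: ...

def rollout_step(high_level, low_level: LowLevelPolicy,
                 obs: np.ndarray, lang_embedding: np.ndarray) -> np.ndarray:
    # The high-level diffusion policy plans a latent goal from the language
    # embedding; any LowLevelPolicy (HULC here) then tracks that goal.
    goal = high_level.plan(obs, lang_embedding)
    return low_level.act(obs, goal)
```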

SOTA performance on MT-LHC CALVIN

  • We achieve an average of 88.7% success at horizon length 1 (SR 1) on the multi-task long-horizon control problem in CALVIN. This is the highest performance yet of any model that incorporates no inductive biases on the problem; the metric is sketched below.
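
For context, CALVIN's long-horizon evaluation rolls out chains of five language instructions per sequence, and SR at horizon length k is the fraction of the 1000 evaluation sequences whose first k instructions all succeed. A minimal sketch of that metric (illustrative only, not the repository's evaluation script):

```python
from typing import List

def success_rates(chains: List[List[bool]], horizon: int = 5) -> List[float]:
    """chains[i][j] is True if task j of evaluation sequence i succeeded.

    Returns SR for horizon lengths 1..horizon: a sequence counts toward
    SR_k only if its first k tasks all succeeded.
    """
    n = len(chains)
    return [sum(all(c[:k]) for c in chains) / n for k in range(1, horizon + 1)]

# Two toy sequences: one completes all 5 tasks, one fails at the second task
print(success_rates([[True] * 5, [True, False, False, False, False]]))
# -> [1.0, 0.5, 0.5, 0.5, 0.5]
```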

Cached CALVIN Datasets

  • We offer a daemon script that runs in the background, caching HULC's entire shared-memory dataset so that you save ~20 minutes at the start of each training run, which adds up to hours and possibly days over the course of a full research project. A rough sketch of the idea follows.
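
The idea is simply to keep the dataset resident in RAM-backed storage (e.g. /dev/shm) across runs so that later trainings skip the initial load. The sketch below shows the idea only, with hypothetical paths and a plain file copy standing in for the actual shared-memory mechanism:

```python
import shutil
from pathlib import Path

# Hypothetical locations; the real daemon script defines its own.
SRC = Path("dataset/task_D_D")        # on-disk CALVIN dataset
DST = Path("/dev/shm/calvin_cache")   # RAM-backed copy shared across runs

def ensure_cached() -> Path:
    """Copy the dataset into /dev/shm once; later runs reuse the cached copy."""
    if not DST.exists():
        shutil.copytree(SRC, DST)
    return DST

if __name__ == "__main__":
    print(f"dataset cached at {ensure_cached()}")
```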


🧪 Prior Experiment Logs

An example of a recent LCD training run can be found here for your reference. We will also soon release the complete logs of all experiments run for this project.

⚓ Installation

We require either Mambaforge or Conda, as well as git lfs. We highly recommend Mambaforge, a faster drop-in replacement for Conda. To install, simply run

$ make install

This will set up a new conda environment lcd. It will also download all necessary data (~9.5 GB), including

  • HULC seeds
  • On-policy offline datasets
  • LCD seeds

Options

If you would like to install just the repository without downloading data, run

$ make NO_DATA=1 install

If you just want to download the data, run

$ git clone https://github.com/ezhang7423/hulc-data.git --recurse-submodules

Troubleshooting

We have thoroughly tested the installation process and environment with NVIDIA GPUs on CUDA versions 11.6, 11.7, and 12.0, running Ubuntu 18.04, Ubuntu 20.04, and AlmaLinux 8.7 (similar to RHEL). Running on Windows or macOS will likely present difficulties. If any part of the installation fails, you can activate the environment with mamba/conda activate lcd and debug the issue by running the individual commands from the install section of the Makefile.

If you see this error:

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - pytorch3d=0.7.2 -> python[version='>=2.7,<2.8.0a0|>=3.11,<3.12.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']

Your python: python=3.8

try using mamba instead of conda.

Usage

Once the repository is installed, all scripts can be run with lcd. This is also equivalent to directly running python ./src/lcd/__main__.py, which may be preferable when using a debugger.

> lcd                                                          (lcd) 
                                                                     
 Usage: lcd [OPTIONS] COMMAND [ARGS]...                              
                                                                     
╭─ Options ───────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current    │
│                               shell.                                │
│ --show-completion             Show completion for the current       │
│                               shell, to copy it or customize the    │
│                               installation.                         │
│ --help                        Show this message and exit.           │
╰─────────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────────╮
│ rollout     Rollout in the environment for evaluation or dataset    │
│             collection                                              │
│ train_hulc  Train the original hulc model                           │
│ train_lcd   Train the lcd (diffusion) model                         │
╰─────────────────────────────────────────────────────────────────────╯

Train HULC

lcd train_hulc

Evaluate HULC

lcd rollout hulc

Generate on-policy data

lcd rollout generate

Train LCD

lcd train_lcd

Evaluate LCD

lcd rollout lcd

It's really that easy! If you would like to pass more training options to hulc or diffuser, you can do so in exactly the same way as in the original repositories (see here for hulc and here for diffuser); these scripts simply pass the arguments through to the underlying original code. Note that the dataset key is no longer supported for diffuser, as there is only one dataset, and that the task_D_D dataset needs to be downloaded first by following the directions at ./language-control-diffusion/submodules/hulc-baseline/dataset.

If using wandb, please change the key wandb_entity in ./lcd/config/calvin.py and ./submodules/hulc-baseline/conf/logger/wandb.yaml to your team or username.

📈 Releases

You can see the list of available releases on the GitHub Releases page.

We follow the Semantic Versions specification and use the Release Drafter. As pull requests are merged, a draft release is kept up to date with the changes, ready to publish whenever you decide. With the categories option, you can categorize pull requests in release notes using labels.

List of labels and corresponding titles

Label                          Title in Releases
enhancement, feature           🚀 Features
bug, refactoring, bugfix, fix  🔧 Fixes & Refactoring
build, ci, testing             📦 Build System & CI/CD
breaking                       💥 Breaking Changes
documentation                  📝 Documentation
dependencies                   ⬆️ Dependencies updates

GitHub creates the bug, enhancement, and documentation labels for you. Dependabot creates the dependencies label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them.

๐Ÿ—๏ธ Development

Directory Structure

.
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── Makefile
├── README.md
├── SECURITY.md
├── cookiecutter-config-file.yml
├── poetry.lock
├── pyproject.toml
├── requirements.txt
├── setup.cfg
├── submodules
│   ├── hulc-baseline # original hulc code
│   └── hulc-data
│       ├── (indices,ordering,seq).pt # evaluation sequences sorted by task, useful for single-task evaluation
│       ├── README.md
│       ├── annotations.json # the training and evaluation language descriptions of all tasks
│       ├── default_1000_sequences.pt # the default 1000 tasks generated during evaluation, cached to speed up eval by ~3 min per run
│       ├── hulc-baselines-30 # 3 HULC seeds, each trained for 30 epochs in the same manner with the original code checkpointed at hulc-baseline
│       ├── hulc-trajectories # the offline on-policy datasets used to train lcd, saved in the latent space of the LLP after processing through its encoder
│       ├── lcd-seeds # high-level diffusion policies trained on the aforementioned seeds
│       └── t5-v1_1-xxl_embeddings.pt # t5 XXL embeddings of all annotations
└── src
    └── lcd
        ├── __init__.py
        ├── __main__.py # the primary entrypoint; delegates commands by automatically parsing the apps directory
        ├── __pycache__
        ├── apps # holds all the ways of interacting with this repo
        │   ├── __pycache__
        │   ├── rollout.py # evaluate hulc or lcd, and collect on-policy datasets
        │   ├── train_hulc.py # a wrapper around the original hulc training code in hulc-baseline
        │   └── train_lcd.py # a wrapper around the diffuser training code in ./src/lcd/scripts/diffuser.py
        ├── config
        │   ├── __pycache__
        │   └── calvin.py # holds all configuration for training diffuser on the calvin benchmark
        ├── datasets
        │   ├── __init__.py
        │   ├── __pycache__
        │   └── sequence.py # loads the aforementioned on-policy datasets
        ├── models
        │   ├── __init__.py
        │   ├── __pycache__
        │   ├── diffusion.py
        │   ├── helpers.py
        │   └── temporal.py
        ├── scripts # older entrypoints that will be transitioned to apps
        │   ├── diffuser.py # training and evaluation of diffuser
        │   └── generation
        │       ├── dataset.py # generate the on-policy dataset from the goal space
        │       ├── embeddings.py # generate the t5 XXL embeddings
        │       └── task_orderings.py # generate an updated version of the (indices,ordering,seq).pt file in ./submodules/hulc-data
        └── utils
            ├── __init__.py
            ├── __pycache__
            ├── arrays.py
            ├── config.py
            ├── eval.py # the meat of the rollout and evaluation code
            ├── git_utils.py
            ├── serialization.py
            ├── setup.py
            ├── timer.py
            └── training.py # the meat of the diffuser training code

  1. Install and initialize Poetry and install pre-commit hooks:

make install
make pre-commit-install

  2. Run the codestyle:

make codestyle

Makefile usage

The Makefile contains many targets for faster development.

1. Download and remove Poetry

To download and install Poetry run:

make poetry-download

To uninstall, run:

make poetry-remove

2. Install all dependencies and pre-commit hooks

Install requirements:

make install

Pre-commit hooks can be installed after git init via

make pre-commit-install

3. Codestyle

Automatic formatting uses pyupgrade, isort and black.

make codestyle

# or use synonym
make formatting

Codestyle checks only, without rewriting files:

make check-codestyle

Note: check-codestyle uses the isort, black, and darglint libraries.

Update all dev libraries to the latest version using one command:

make update-dev-deps

4. Code security

make check-safety

This command launches Poetry integrity checks and identifies security issues with Safety and Bandit.

5. Type checks

Run the mypy static type checker:

make mypy

6. Tests with coverage badges

Run pytest

make test

7. All linters

Of course there is a command to run all linters in one:

make lint

the same as:

make test && make check-codestyle && make mypy && make check-safety

8. Docker

make docker-build

which is equivalent to:

make docker-build VERSION=latest

Remove docker image with

make docker-remove

More information about docker.

9. Cleanup

Delete pycache files

make pycache-remove

Remove package build

make build-remove

Delete .DS_STORE files

make dsstore-remove

Remove .mypy_cache

make mypycache-remove

Or to remove all above run:

make cleanup

Poetry

Want to know more about Poetry? Check its documentation.

Details about Poetry

Poetry's commands are very intuitive and easy to learn, like:

  • poetry add numpy@latest
  • poetry run pytest
  • poetry publish --build

etc.

🎯 What's next

Replanning, further probing language generalization, trying discrete state-action spaces, and scaling the model are all interesting research directions to explore. As far as further improvements to this repository go, we plan to add:

  1. Parallel training (multiple runs, one per gpu)
  2. Diffuser from pixels
  3. BC-Z and RT-1 ablations

🛡 License


This project is licensed under the terms of the MIT license. See LICENSE for more details.

📃 Citation

@article{language-control-diffusion,
  author={Zhang, Edwin and Lu, Yujie and Wang, William and Zhang, Amy},
  title={LAD: Language Control Diffusion: efficiently scaling through Space, Time, and Tasks},
  year = {2023},
  journal={arXiv preprint arXiv:2210.15629},
  howpublished = {\url{https://github.com/ezhang7423/language-control-diffusion}}
}

๐Ÿ‘ Credits ๐Ÿš€ Your next Python package needs a bleeding-edge project structure.

Massive thanks to Oier Mees and Lukas Hermann for providing the CALVIN and HULC codebases, and huge thanks to Michael Janner and Yilun Du for providing the Diffuser codebase. This work would not be possible without standing on the shoulders of these giants.

Template: python-package-template