Soft-Module

Implementation for "Multi-task Reinforcement Learning with Soft Modularization"

Paper Link: Multi-Task Reinforcement Learning with Soft Modularization (arXiv:2003.13661)

Demo

Our project page is at https://rchalyang.github.io/SoftModule/

Setup Environment

Environment Requirements

  • Python 3
  • PyTorch 1.7
  • posix_ipc
  • tensorboardX
  • tabulate, gym
  • MetaWorld (see the next section for setup)
  • seaborn (for plotting)

MetaWorld Setup

We evaluated our method on MetaWorld.

Since MetaWorld is under active development, we ran all experiments on our fork of MetaWorld (https://github.com/RchalYang/metaworld).

# Install our MetaWorld fork
git clone https://github.com/RchalYang/metaworld.git
cd metaworld
pip install -e .
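
After installation, a quick check that the required packages can be imported may save a failed training launch. The sketch below is not part of the repository; it assumes the usual import names for the packages listed above (for example `torch` for PyTorch and `metaworld` for the fork), and the helper name `is_installed` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages in the requirements list.
REQUIRED_MODULES = [
    "torch", "posix_ipc", "tensorboardX",
    "tabulate", "gym", "metaworld", "seaborn",
]


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


missing = [name for name in REQUIRED_MODULES if not is_installed(name)]
print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be found; it does not verify versions, such as PyTorch 1.7.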

Our Network Structure

See ModularGatedCascadeCondNet in torchrl/networks/nets.py for details
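
For orientation only, here is a toy sketch of softmax-weighted ("soft") routing between modules in two consecutive layers, written in plain Python. It does not reproduce ModularGatedCascadeCondNet: the names `soft_route` and `routing_logits` are hypothetical, and the routing scores are given by hand here, whereas a learned routing network would produce them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_route(module_outputs, routing_logits):
    """Mix one layer's module outputs into inputs for the next layer.

    module_outputs: list of n vectors, one per source module.
    routing_logits[i][j]: score of the link from source module j
        to target module i (hypothetical layout).
    Returns one mixed input vector per target module.
    """
    dim = len(module_outputs[0])
    mixed = []
    for logits_i in routing_logits:
        weights = softmax(logits_i)  # weights over source modules sum to 1
        mixed.append([
            sum(w * out[d] for w, out in zip(weights, module_outputs))
            for d in range(dim)
        ])
    return mixed


# Two source modules with 2-dimensional outputs (illustrative values).
outputs = [[1.0, 0.0], [0.0, 1.0]]
# Target 0 weighs both sources equally; target 1 strongly prefers source 0.
logits = [[0.0, 0.0], [50.0, -50.0]]
inputs_next = soft_route(outputs, logits)
print(inputs_next)
```

Because the weights are continuous rather than hard selections, every module can contribute to every module in the next layer, with the contribution scaled by its routing weight.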

Training

All logs and snapshots are stored in the logging directory, which defaults to "./log/EXPERIMENT_NAME".

EXPERIMENT_NAME can be set with the "--id" argument when starting an experiment, and the prefix directory can be set with the "--log_dir" argument.
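
As a minimal sketch of how the two arguments combine, the helper below joins the prefix directory and the experiment name. The function name `experiment_log_dir` is hypothetical, and the training scripts may create further subdirectories beneath this path; only the documented default is mirrored here.

```python
import os


def experiment_log_dir(experiment_id, log_dir="./log"):
    """Join the prefix directory (--log_dir) and the experiment name (--id)."""
    return os.path.join(log_dir, experiment_id)


default_path = experiment_log_dir("MT10_Fixed_Modular_Shallow")
custom_path = experiment_log_dir("MT10_Fixed_Modular_Shallow", log_dir="/tmp/runs")
print(default_path)
print(custom_path)
```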

# Modular Network // MT10-Conditioned // Shallow
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt10/modular_2_2_2_256_reweight_rand.json --id MT10_Conditioned_Modular_Shallow --seed SEED --worker_nums 10 --eval_worker_nums 10


# Modular Network // MT10-Fixed // Shallow
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt10/modular_2_2_2_256_reweight.json --id MT10_Fixed_Modular_Shallow --seed SEED --worker_nums 10 --eval_worker_nums 10


# Modular Network // MT10-Conditioned // Deep
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt10/modular_4_4_2_128_reweight_rand.json --id MT10_Conditioned_Modular_Deep --seed SEED --worker_nums 10 --eval_worker_nums 10


# Modular Network // MT10-Fixed // Deep
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt10/modular_4_4_2_128_reweight.json --id MT10_Fixed_Modular_Deep --seed SEED --worker_nums 10 --eval_worker_nums 10


# Modular Network // MT50-Conditioned // Shallow
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt50/modular_2_2_2_256_reweight_rand.json --id MT50_Conditioned_Modular_Shallow --seed SEED --worker_nums 50 --eval_worker_nums 50


# Modular Network // MT50-Fixed // Shallow
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt50/modular_2_2_2_256_reweight.json --id MT50_Fixed_Modular_Shallow --seed SEED --worker_nums 50 --eval_worker_nums 50


# Modular Network // MT50-Conditioned // Deep
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt50/modular_4_4_2_128_reweight_rand.json --id MT50_Conditioned_Modular_Deep --seed SEED --worker_nums 50 --eval_worker_nums 50


# Modular Network // MT50-Fixed // Deep
python starter/mt_para_mtsac_modular_gated_cas.py --config meta_config/mt50/modular_4_4_2_128_reweight.json --id MT50_Fixed_Modular_Deep --seed SEED --worker_nums 50 --eval_worker_nums 50
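
The eight commands above follow one pattern, so they can be generated for several seeds at once. The sketch below rebuilds that pattern from the listed commands; the helper name `training_command` and the seed values 1, 2, 3 are illustrative assumptions, not part of the repository.

```python
import shlex


def training_command(benchmark, conditioned, deep, seed):
    """Build one training command following the pattern of the commands above."""
    if benchmark not in ("mt10", "mt50"):
        raise ValueError("benchmark must be 'mt10' or 'mt50'")
    size = "4_4_2_128" if deep else "2_2_2_256"   # Deep vs. Shallow config
    suffix = "_rand" if conditioned else ""       # Conditioned vs. Fixed config
    config = f"meta_config/{benchmark}/modular_{size}_reweight{suffix}.json"
    run_id = "_".join([
        benchmark.upper(),
        "Conditioned" if conditioned else "Fixed",
        "Modular",
        "Deep" if deep else "Shallow",
    ])
    workers = "10" if benchmark == "mt10" else "50"
    return [
        "python", "starter/mt_para_mtsac_modular_gated_cas.py",
        "--config", config,
        "--id", run_id,
        "--seed", str(seed),
        "--worker_nums", workers,
        "--eval_worker_nums", workers,
    ]


# Print one command per example seed; run them with a shell or subprocess.
commands = [
    training_command("mt10", conditioned=True, deep=False, seed=s)
    for s in (1, 2, 3)
]
for cmd in commands:
    print(shlex.join(cmd))
```

Returning the command as a list of arguments keeps it usable with `subprocess.run` directly, while `shlex.join` gives a copy-pasteable shell line.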

Plot Training Curve

To plot the training curves, use the following command.

  • The id argument takes one or more experiment names.

  • The seed argument takes one or more seeds.

  • Replace "mean_success_rate" with another entry to plot the curve for that entry.

python torchrl/utils/plot_csv.py --id EXPERIMENTS --env_name mt10 --entry "mean_success_rate" --add_tag POSTFIX_FOR_OUTPUT_FILES --seed SEEDS
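
When several seeds are plotted together, a common way to summarize them is the per-step mean and standard deviation across seeds. The sketch below shows that aggregation on made-up numbers; whether plot_csv.py computes exactly this is an assumption, and the function name `aggregate_over_seeds` is hypothetical.

```python
from statistics import mean, pstdev


def aggregate_over_seeds(curves):
    """Return (mean, std) per step across seeds.

    curves: one list of values per seed, e.g. success rates per evaluation.
    Curves are truncated to the shortest one so every step has all seeds.
    """
    steps = min(len(curve) for curve in curves)
    return [
        (
            mean(curve[t] for curve in curves),
            pstdev(curve[t] for curve in curves),
        )
        for t in range(steps)
    ]


# Illustrative success-rate curves for two seeds (not real results).
curves = [[0.0, 0.2, 0.6], [0.0, 0.4, 0.8, 0.9]]
summary = aggregate_over_seeds(curves)
for step, (avg, std) in enumerate(summary):
    print(f"step {step}: mean={avg:.2f} std={std:.2f}")
```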

Citation

If you find our work useful, please cite it.

@misc{yang2020multitask,
      title={Multi-Task Reinforcement Learning with Soft Modularization}, 
      author={Ruihan Yang and Huazhe Xu and Yi Wu and Xiaolong Wang},
      year={2020},
      eprint={2003.13661},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}