
Guided Domain Randomization through Adversarial Agent

A novel approach to Guided Domain Randomization through the implementation of an Adversarial Agent, trained to confuse the Task Agent in order to produce a more robust policy. Read the paper discussing the ideas, implementation, and results here.

Team members: Simone Carena, Francesco Paolo Carmone, Ludovica Mazzucco

Official assignment at Google Doc.

Python Virtual Environment

Create and activate a virtual environment with Python 3.10.0, then install all the dependencies with

    pip install -r requirements.txt

This code has been developed and tested on Arch Linux x86_64, kernel 6.7.6-arch1-2, under Wayland. Alternatively, the .ipynb notebooks are platform agnostic and can be used in place of the scripts.

Files

Available gym environments

  • Source: the hopper, whose torso mass is shifted by one.
  • Target: the hopper with its original masses.
  • UDR: the hopper, with the same torso mass shift. Every time the reset function is called, the hopper is assigned new masses drawn uniformly at random (see the sketch after this list).
  • Deceptor: the hopper, with the same torso mass shift. Every time the reset function is called, the hopper is assigned new masses, but these are generated by the Adversarial Agent instead of being sampled uniformly.
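
As a rough illustration of the UDR idea, the sketch below re-samples the hopper's masses uniformly on every reset. This is a minimal sketch only: the wrapper, the mass bounds, and the mujoco-py-style sim.model.body_mass access are assumptions, not the repo's actual implementation.

    import gym
    import numpy as np

    class UniformMassRandomization(gym.Wrapper):
        """Re-sample the hopper's masses uniformly at random on every reset."""

        def __init__(self, env, low=0.5, high=1.5):
            super().__init__(env)
            self.low, self.high = low, high
            # Remember the nominal masses so the scaling stays centered on them
            self.base_masses = env.unwrapped.sim.model.body_mass.copy()

        def reset(self, **kwargs):
            scale = np.random.uniform(self.low, self.high, size=self.base_masses.shape)
            self.env.unwrapped.sim.model.body_mass[:] = self.base_masses * scale
            # The Deceptor environment would instead query the Adversarial
            # Agent for the new masses rather than sampling them uniformly.
            return self.env.reset(**kwargs)

    env = UniformMassRandomization(gym.make("Hopper-v3"))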

An example of the standard plot nomenclature is source -> target: this means that the agent has been trained on source, and is later tested on target.
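
In code, the convention corresponds to something like the sketch below, where a policy trained on environment A is rolled out on environment B. The environment ID, the policy callable, and the pre-0.26 gym step/reset API are assumptions made for illustration.

    import gym

    def evaluate(policy, env_id, episodes=50):
        """Average episodic return of `policy` on the environment `env_id`."""
        env = gym.make(env_id)
        returns = []
        for _ in range(episodes):
            obs, done, total = env.reset(), False, 0.0
            while not done:
                obs, reward, done, _ = env.step(policy(obs))
                total += reward
            returns.append(total)
        return sum(returns) / len(returns)

    # source -> target: a policy trained on the source environment is
    # tested on the target one (the env ID below is a hypothetical placeholder)
    # mean_return = evaluate(trained_policy, "CustomHopper-target-v0")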

File description

task2.py trains source.mdl.

task3.py trains target.mdl, and tests source -> source, source -> target, and target -> target. Please note: if a file named target.mdl already exists, the script skips training and only runs the tests.

task4.py trains dr_model.mdl, and tests drsource -> target. Please note: if a file named dr_model.mdl already exists, the script skips training and only runs the tests.

train_adversarial_agent.py: Trains the adversarial agent and saves it as deception_model_agent_dr. Run with --help to list all the available options.

train_multi_adversarial_agents.py: Trains 5 different models, with seeds ranging from 1 to 5, and saves them. Run with --help to list all the available options.

test_models: Tests the cases source -> target, target -> target, drsource -> target, and deceptorsource -> target.
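
As a toy illustration of the adversarial scheme behind train_adversarial_agent.py, the sketch below uses a random-search "adversary" that keeps whichever mass proposal hurts the task policy the most; retraining under such masses is what should produce a more robust policy. Both agents here are crude stand-ins (a random policy and random search), not the RL agents actually used in the repo, and the pre-0.26 gym API is again assumed.

    import gym
    import numpy as np

    def episode_return(env, policy, masses):
        """Run one episode with the given body masses and return its total reward."""
        env.unwrapped.sim.model.body_mass[:] = masses
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        return total

    env = gym.make("Hopper-v3")
    base_masses = env.unwrapped.sim.model.body_mass.copy()
    policy = lambda obs: env.action_space.sample()  # stand-in task policy

    # The "adversary": keep the mass proposal that minimizes the return
    worst_masses, worst_return = base_masses, np.inf
    for _ in range(20):
        candidate = base_masses * np.random.uniform(0.5, 1.5, size=base_masses.shape)
        r = episode_return(env, policy, candidate)
        if r < worst_return:
            worst_masses, worst_return = candidate, r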