huanzhang12/SA_PPO

Robust deep reinforcement learning with adversarial attacks.

potatoKiller0 opened this issue · 2 comments

Hi, in your paper you compare your results with 'Robust deep reinforcement learning with adversarial attacks', but I can't find the code. Could you provide it? Thanks.

I implemented their algorithm, but the code is not released in this repository. However, you can find an implementation of their algorithm in the new repository for my recent ICLR paper on robust RL (ATLA-PPO): https://github.com/huanzhang12/ATLA_robust_RL

Using the new code repository, you can run the following command (this uses the Ant environment as an example):

python run.py --config-path config_ant_vanilla_ppo.json --robust-ppo-eps 0.15 --attack-method critic --attack-ratio 1.0 --collect-perturbed-states true

--robust-ppo-eps specifies the epsilon for the training-time adversarial attack. --attack-method selects the attack algorithm used during training (the default is the critic attack; see here for a list of attacks; you can also try our MAD attack, for example). --attack-ratio specifies the percentage of states under attack (you can try 0.5 and 1.0; sometimes 1.0 leads to divergence). --collect-perturbed-states true means the trajectory is collected using the perturbed observations (rather than the ground-truth states) during training; I found this to be important for their algorithm to converge. A rough sketch of how these pieces fit together is given below.
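For intuition only, here is a minimal PyTorch sketch of a critic-based training-time attack combined with perturbed-state collection. This is not the code from either repository; critic_attack, value_fn, policy, and the buffer are hypothetical names for illustration, under the assumption that the critic attack perturbs the observation within an L-inf ball to lower the critic's value estimate.

```python
import torch

def critic_attack(obs, value_fn, eps, steps=10):
    """Hypothetical sketch of a critic-based attack: projected gradient
    descent to find a perturbation within an L-inf ball of radius eps
    that minimizes the critic's value estimate of the observation."""
    obs_adv = obs.clone().detach().requires_grad_(True)
    step_size = eps / steps
    for _ in range(steps):
        value = value_fn(obs_adv)  # V of the perturbed observation
        grad, = torch.autograd.grad(value.sum(), obs_adv)
        with torch.no_grad():
            obs_adv = obs_adv - step_size * grad.sign()       # step to lower V
            obs_adv = obs + (obs_adv - obs).clamp(-eps, eps)  # project into eps-ball
        obs_adv.requires_grad_(True)
    return obs_adv.detach()

def collect_step(obs, policy, value_fn, buffer, eps=0.15, attack_ratio=1.0):
    """One rollout step: attack a fraction of states (cf. --attack-ratio)
    and, as with --collect-perturbed-states true, store the perturbed
    observation in the trajectory instead of the ground-truth state."""
    if torch.rand(()).item() < attack_ratio:
        obs_used = critic_attack(obs, value_fn, eps)
    else:
        obs_used = obs
    action = policy(obs_used)          # act on the (possibly perturbed) observation
    buffer.append((obs_used, action))  # perturbed obs enters the training data
    return action
```

In the actual repository these steps are driven by the command-line flags above; the sketch only shows how the eps-ball constraint, the attack ratio, and the perturbed-state collection interact.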

Thanks.