MROCS

Benchmark of the MROCS algorithm, inspired by MADDPG and my previous work (wheeler), and of the new rewheeler framework (coming soon), on the multi-agent Tennis environment from Unity.

Project Details

  • state_size=24, action_size=2, the defaults provided by the UnityML Tennis environment
    • for the critic, however, we use state_size=48, action_size=4 (2 × 24 and 2 × 2, i.e. both players combined), following the MROCS approach (an update of the MADDPG algorithm)
  • 2-player environment; a shared actor and a shared critic control both players
  • Policy gradient method: the MROCS algorithm (DDPG + MADDPG ideas/implementation, with updates)
  • How to install:
    conda install -y pytorch -c pytorch
    pip install unityagents
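
The critic sizes listed under Project Details can be illustrated with a minimal sketch. It assumes that the 48-dimensional state and 4-dimensional action are simply the two players' observations and actions flattened together, as in a MADDPG-style centralised critic. The helper name joint_critic_input and the constants are hypothetical and are not taken from the MROCS code. NumPy is used only for illustration.

```python
import numpy as np

# Sizes from the project details; the constant names are illustrative.
N_AGENTS = 2
STATE_SIZE = 24
ACTION_SIZE = 2


def joint_critic_input(states, actions):
    """Flatten per-player observations and actions into one joint view.

    Hypothetical helper: assumes the critic input is a plain concatenation
    over players, which the README implies but does not spell out.
    """
    states = np.asarray(states, dtype=np.float32)
    actions = np.asarray(actions, dtype=np.float32)
    assert states.shape == (N_AGENTS, STATE_SIZE)
    assert actions.shape == (N_AGENTS, ACTION_SIZE)
    return states.reshape(-1), actions.reshape(-1)


# Placeholder data standing in for one environment step.
states = np.zeros((N_AGENTS, STATE_SIZE), dtype=np.float32)
actions = np.zeros((N_AGENTS, ACTION_SIZE), dtype=np.float32)

joint_state, joint_action = joint_critic_input(states, actions)
print(joint_state.shape, joint_action.shape)  # (48,) (4,)
```

The shared actor would still see only one player's 24-dimensional observation at a time; only the critic gets the joint view.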