move37

This repository is related to:

Homework_Assignment_Week2

This solution is based on:

https://github.com/aaksham/frozenlake

How to run

cd Homework_Assignment_Week2
python DeterministicFrozenLake.py

Homework_Assignment_Week3

This solution is based on:

https://oneraynyday.github.io/ml/2018/05/24/Reinforcement-Learning-Monte-Carlo/#example-cliff-walking

How to run

cd Homework_Assignment_Week3
python blackjack.py

python CliffWalking.py

Midterm Assignment: Make a Bipedal Robot Walk (Week5)

Task

https://www.theschool.ai/courses/move-37-course/lessons/midterm-assignment-make-a-bipedal-robot-walk/

"The midterm is to make a bipedal humanoid robot walk in a simulation.
You can use OpenAI Gym for the environment.
https://github.com/search?q=bipedal+gym
This link shows some potential solutions that you can use to help you when you build your own.

We’re looking for good documentation, readable code, and bonus points for using reinforcement learning in a novel way for this challenge."

Solution

This assignment has two solutions:

DQN

This solution is based on:

How to run

cd Homework_Assignment_Week5/dqn/
python dqn.py

A2C

This solution is based on:

Deep Reinforcement Learning Hands-On by Maxim Lapan
https://www.packtpub.com/big-data-and-business-intelligence/deep-reinforcement-learning-hands

Chapter 14 & Chapter 15:

Continuous Action Space, The Actor-Critic (A2C) method
Trust Regions – TRPO, PPO, and ACKTR
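
As a rough illustration of the actor-critic idea these chapters cover, the sketch below computes discounted returns and the advantage (return minus the critic's value estimate) for one rollout. It is a generic, plain-Python sketch and is not taken from 01_train_a2c.py; the function names, the gamma default and the bootstrap_value parameter are assumptions made for this example.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Return the discounted return G_t for every step of one rollout.

    bootstrap_value is the critic's estimate for the state after the last
    step (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each step reuses the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99, bootstrap_value=0.0):
    """Advantage A_t = G_t - V(s_t), where values holds the critic outputs."""
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    return [g - v for g, v in zip(returns, values)]
```

In an A2C update the advantage typically scales the policy-gradient term, while the returns serve as the regression target for the critic.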

How to run

cd Homework_Assignment_Week5/a2c/
python 01_train_a2c.py --name bipedal --cuda
python 02_play.py --model saves/a2c-bipedal/<<your data file>>.dat --save 45

Homework_Assignment_Week6

This solution is based on:

https://github.com/AndersonJo/dqn-pytorch

Homework_Assignment_Week7

This solution is based on:

How to run

cd Homework_Assignment_Week7/FlappyBird/
python flappybird.py

How to run

cd Homework_Assignment_Week7/NeuroEvolution-Flappy-Bird/
python flappy.py

Homework_Assignment_Week8

Solution 1: REINFORCE

This solution is based on:

https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Policy%20Gradients/Cartpole/Cartpole%20REINFORCE%20Monte%20Carlo%20Policy%20Gradients.ipynb

How to run

cd Homework_Assignment_Week8/Deep_reinforcement_learning_Course/
jupyter notebook Lunar\ Lander\ REINFORCE\ Monte\ Carlo\ Policy\ Gradients.ipynb

Solution 2: REINFORCE

This solution is based on:

How to run

cd Homework_Assignment_Week8/OpenAI_Gym_AI
python OpenAI.py -m train
python OpenAI.py -m test

Homework_Assignment_Week9

Re-implement A2C but in Tensorflow

This solution is based on:

https://github.com/ikostrikov/pytorch-a3c

Final Project (Multi Agent Research Project) (Week10)

Task

"Reproduce the Deep Deterministic Policy Gradients algorithm for a multi-agent particle environment.
The algorithm should learn how to get both agents to ‘tag’ each other."
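
DDPG, the algorithm named in the task, keeps slowly moving target copies of its actor and critic networks and nudges them toward the online networks after each update. The sketch below shows that soft (Polyak) update on flat lists of parameters. It is a generic illustration, not code from this repository; the name soft_update and the tau default are assumptions made for this example.

```python
def soft_update(target_params, online_params, tau=0.01):
    """Blend online parameters into the target parameters.

    Implements target <- tau * online + (1 - tau) * target, element-wise.
    Both arguments are flat lists of floats of equal length.
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_params, online_params)
    ]
```

A small tau keeps the targets used in the critic's loss nearly stationary, which is the usual motivation for this update.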

Solution - Tensorflow

This solution is based on:

How to run

To install, cd into the directory (Homework_Assignment_Week10/tensorflow1_multiagent) and type pip install -e .
Known dependencies: Python (3.5.4), OpenAI gym (0.9.5), numpy (1.13.1)
Additional installation instructions: https://github.com/openai/multiagent-particle-envs

cd Homework_Assignment_Week10/tensorflow1_multiagent

python3 ddpg_tag.py --env simple_tag_guided --experiment_prefix ./results/ddpg_1v1/

python3 ddpg_tag.py --env simple_tag_guided_1v2 --experiment_prefix ./results/ddpg_1v2/

python3 ddpg_tag.py --env simple_tag_guided_2v1 --experiment_prefix ./results/ddpg_2v1/

python3 ddpg_tag.py --env simple_tag_guided_2v2 --experiment_prefix ./results/ddpg_2v2/

python3 maddpg_tag.py --env simple_tag_guided --experiment_prefix ./results/maddpg_1v1/

python3 maddpg_tag.py --env simple_tag_guided_1v2 --experiment_prefix ./results/maddpg_1v2/

python3 maddpg_tag.py --env simple_tag_guided_2v1 --experiment_prefix ./results/maddpg_2v1/

python3 maddpg_tag.py --env simple_tag_guided_2v2 --experiment_prefix ./results/maddpg_2v2/

Add the --render parameter to any of these commands to render the process.
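
The eight commands above follow one pattern: two scripts (ddpg_tag.py, maddpg_tag.py) crossed with four environment variants. A minimal sketch of a helper that generates them is shown below, assuming the script names, flags and result paths are exactly as listed above. The helper name build_commands is hypothetical and not part of the repository.

```python
# (label, environment suffix) pairs; the 1v1 environment has no suffix.
ENV_VARIANTS = (("1v1", ""), ("1v2", "_1v2"), ("2v1", "_2v1"), ("2v2", "_2v2"))
ALGORITHMS = ("ddpg", "maddpg")


def build_commands(render=False):
    """Return one argument list per algorithm/environment combination."""
    commands = []
    for algo in ALGORITHMS:
        for label, suffix in ENV_VARIANTS:
            cmd = [
                "python3", "{}_tag.py".format(algo),
                "--env", "simple_tag_guided{}".format(suffix),
                "--experiment_prefix", "./results/{}_{}/".format(algo, label),
            ]
            if render:
                # Optional flag described above.
                cmd.append("--render")
            commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands; each list can also be passed to subprocess.run.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Run it from Homework_Assignment_Week10/tensorflow1_multiagent so that the relative script and result paths resolve.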
