Simple-Random-Search

A simple random search technique that provides a competitive alternative to reinforcement learning for locomotion tasks on MuJoCo bodies such as Humanoid and Half-Cheetah.



Augmented Random Search using Numpy

The project builds a simple artificial-intelligence algorithm that surpasses many existing algorithms on Humanoid and other MuJoCo (Multi-Joint dynamics with Contact) locomotion tasks. It implements Augmented Random Search (ARS) by training a Half-Cheetah (MuJoCo) body to walk and run across a field.

Motivation

Link to Google DeepMind's video

Existing methods

  • Asynchronous Actor-Critic Agents
  • Deep Learning
  • Deep Reinforcement Learning

How is it different

  • Unlike other AI systems, where exploration occurs after each action (action space), here exploration occurs at the end of each episode (policy space)
  • ARS is a shallow learning technique, unlike the deep learning used in other AI systems (it uses a single perceptron rather than layers of them)
  • ARS discards gradient descent for weight adjustment and instead uses the Method of Finite Differences
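The Method of Finite Differences mentioned above can be sketched in a few lines of NumPy: perturb the weights in random directions, compare the rewards of the positive and negative perturbations, and step along the reward difference. This is a minimal illustration, not the repository's exact code; `rollout` is a hypothetical stand-in for running one episode with the given weights, and the hyperparameter values are illustrative.

```python
import numpy as np

def finite_difference_step(theta, rollout, noise_std=0.03, lr=0.02, num_dirs=8):
    """One weight update via the Method of Finite Differences.

    `rollout(theta)` is assumed to return the total episode reward
    obtained with policy weights `theta`.
    """
    deltas = [np.random.randn(*theta.shape) for _ in range(num_dirs)]
    step = np.zeros_like(theta)
    for delta in deltas:
        r_pos = rollout(theta + noise_std * delta)  # reward with +perturbation
        r_neg = rollout(theta - noise_std * delta)  # reward with -perturbation
        step += (r_pos - r_neg) * delta             # weight direction by reward gap
    # average over directions; no gradients are ever computed
    return theta + lr / num_dirs * step
```

Note that only episode rewards are needed, which is why no backpropagation machinery appears anywhere in the algorithm.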

Implementation

Components

  • Perceptrons
  • Reward mechanism and weight updates
  • Method of Finite Differences to find the best possible direction of movement

Algorithm

  • Scaling the update step by the standard deviation of the rewards.
  • Online normalization of states.
  • Choosing better directions for faster learning.
  • Discarding directions that yield the lowest rewards.
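The refinements listed above can be combined into one update step, sketched below: sample many directions, keep only the best ones by max(r+, r-), and divide the step by the standard deviation of the rewards actually used. As before, `rollout` is a hypothetical episode-reward function and the hyperparameters are illustrative, not the repository's settings.

```python
import numpy as np

def ars_update(theta, rollout, n_dirs=16, top_dirs=8, noise=0.03, lr=0.02):
    """One ARS-style update combining the refinements above."""
    deltas = np.random.randn(n_dirs, *theta.shape)
    r_pos = np.array([rollout(theta + noise * d) for d in deltas])
    r_neg = np.array([rollout(theta - noise * d) for d in deltas])
    # keep the directions with the highest max(r+, r-); discard the worst
    order = np.argsort(-np.maximum(r_pos, r_neg))[:top_dirs]
    # standard deviation of the rewards used scales the step size
    r_used = np.concatenate([r_pos[order], r_neg[order]])
    sigma_r = r_used.std() + 1e-8
    step = sum((r_pos[k] - r_neg[k]) * deltas[k] for k in order)
    return theta + lr / (top_dirs * sigma_r) * step
```

Dividing by sigma_r keeps the step size stable whether episode rewards are in the hundreds or near zero, so a single learning rate works across training.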

Algorithm Overview


Installation

  • Fork and clone the repository using git clone https://github.com/ashutoshtiwari13/Simple-Random-Search.git
  • Run pip install -r requirements.txt
  • Also check Simulation.txt for setting up the PyBullet simulation environment
  • Use the Anaconda Cloud - Spyder IDE (or any framework/IDE of your choice)
  • Use Python 3.6 or above
  • Run the command python ars.py

Results

Reference MuJoCo


Series of Rewards

Rewards start as low as -900 and climb to around +900 in roughly 1000 steps.

Simulation Images


Further reading

  • Ben Recht's Blog
  • Reference paper - Link
  • Research paper used - Link

Happy coding 😊 ❤️ ✔️