Primary language: Jupyter Notebook. License: MIT.

Reinforcement-Learning

Overview

Reinforcement learning trains a machine learning model, called an agent, to make a sequence of decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment.

Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

Reinforcement learning is often seen as a promising route to machine creativity, since an agent that seeks new, innovative ways to perform its tasks is, in a sense, being creative.

Definition

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward.
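
As a rough illustration of "cumulative reward", the sketch below sums the rewards of one episode. The discount factor `gamma` and the function name `discounted_return` are assumptions made for this example and are not defined elsewhere in this README. With `gamma=1.0` the result is the plain sum of rewards.

```python
def discounted_return(rewards, gamma=0.9):
    """Cumulative reward of one episode.

    The reward at step t is weighted by gamma**t. `gamma` is an assumed
    discount factor in [0, 1]; gamma=1.0 gives the undiscounted sum.
    """
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Rewards 1, 0, 2 with gamma = 0.5: 1 + 0.5 * 0 + 0.25 * 2
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # prints 1.5
```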

Reinforcement learning methods have since been used to master chess, Go, and many video games.

There are five key elements of reinforcement learning models:

Agent: The algorithm or function that chooses actions and performs the requested task.

Environment: The world in which the agent carries out its actions.

State: The current situation of the agent in the environment.

Action: A move chosen and performed by the agent in order to gain rewards.

Reward: A numerical feedback signal from the environment that indicates how desirable the agent's last action was.
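
The five elements above can be seen interacting in a minimal sketch of one episode. Everything here is hypothetical and chosen only for illustration: the `Corridor` environment, the `RandomAgent`, and the `reset`/`step`/`act` method names are not taken from this repository or from any particular library.

```python
import random


class Corridor:
    """Hypothetical environment: positions 0..4, with the goal at position 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0  # reward only when the goal is reached
        return self.state, reward, done


class RandomAgent:
    """Hypothetical agent that ignores the state and moves at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])


def run_episode(env, agent, max_steps=1000):
    """Run the agent-environment loop once and return the total reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)               # agent picks an action from the state
        state, reward, done = env.step(action)  # environment returns new state and reward
        total_reward += reward
        if done:
            break
    return total_reward


print(run_episode(Corridor(), RandomAgent(seed=0)))
```

A learning agent would replace `RandomAgent` and use the observed rewards to improve its choice of actions over many episodes.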

History

In 1954, Farley and Clark described a neural-network learning machine designed to learn by trial and error.

In the 1960s the terms "reinforcement" and "reinforcement learning" were used in the engineering literature for the first time.

Sutton & Barto (2018) identify three ‘threads’ in the history of reinforcement learning:

  1. Learning by trial-and-error
  2. The problem of optimal control
  3. Temporal difference learning methods.

These threads were pursued independently by researchers before becoming intertwined in the 1980s, leading to the concept of reinforcement learning as we know it today.

Approaches

Reinforcement learning approaches fall into several categories:

Associative reinforcement learning: Associative reinforcement learning tasks combine facets of stochastic learning automata tasks and supervised learning pattern classification tasks.

Deep reinforcement learning: This approach extends reinforcement learning by using a deep neural network, so the state space does not have to be designed explicitly.

Adversarial deep reinforcement learning: Adversarial deep reinforcement learning is an active area of research in reinforcement learning focusing on vulnerabilities of learned policies.

Fuzzy reinforcement learning: By introducing fuzzy inference in RL, approximating the state-action value function with fuzzy rules in continuous space becomes possible.

Inverse reinforcement learning: In inverse reinforcement learning (IRL), no reward function is given. Instead, the reward function is inferred given an observed behavior from an expert.

Safe reinforcement learning: Safe reinforcement learning can be defined as the process of learning policies that maximize the expected return in problems where it is important to ensure reasonable system performance and/or respect safety constraints during learning and/or deployment.

Partially supervised reinforcement learning: Partially supervised approaches can alleviate the need for extensive training data in supervised learning while reducing the need for costly exhaustive random exploration in pure RL.

Algorithms

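Some reinforcement learning frameworks and algorithms are listed below:

Markov Decision Processes (MDPs): A framework used to model decision-making processes. Its key elements are the decision maker, the states, the actions and the rewards.

Q-Learning: A model-free, off-policy algorithm. It learns the value of each action in each state without a model of the environment, and a policy is obtained by choosing the action with the highest learned value.

State-Action-Reward-State-Action (SARSA): A model-free, on-policy algorithm for learning a policy for a Markov decision process. It updates its value estimates using the action the agent actually takes next.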
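
To make the difference between the two update rules concrete, here is a minimal tabular sketch on a tiny MDP. The 4-state chain, the `TRANSITIONS` table, the epsilon-greedy exploration, and all hyperparameter values (`alpha`, `gamma`, `epsilon`) are assumptions made for this example, not settings taken from this repository's notebooks.

```python
import random

# Hypothetical chain MDP: states 0..3, state 3 is terminal.
# TRANSITIONS[state][action] = (next_state, reward); transitions are deterministic.
ACTIONS = ["left", "right"]
TRANSITIONS = {
    0: {"left": (0, 0.0), "right": (1, 0.0)},
    1: {"left": (0, 0.0), "right": (2, 0.0)},
    2: {"left": (1, 0.0), "right": (3, 1.0)},
}
TERMINAL = 3


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(method, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values with method 'q_learning' or 'sarsa'."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(TERMINAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(q, state, epsilon, rng)
        while state != TERMINAL:
            next_state, reward = TRANSITIONS[state][action]
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            if method == "q_learning":
                # Off-policy target: value of the best next action,
                # regardless of which action is executed next.
                bootstrap = max(q[(next_state, a)] for a in ACTIONS)
            else:
                # SARSA, on-policy target: value of the next action actually taken.
                bootstrap = q[(next_state, next_action)]
            target = reward + gamma * bootstrap
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = next_state, next_action
    return q


def greedy_policy(q):
    """Best-valued action in every non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in TRANSITIONS}


print(greedy_policy(train("q_learning")))
print(greedy_policy(train("sarsa")))
```

The two methods differ only in the `bootstrap` line. Because SARSA's target includes the effect of exploratory actions, its value estimates depend on `epsilon`, while Q-learning estimates the values of the greedy policy.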

Applications

Common areas where reinforcement learning is used are listed below:

Computer Games: Pac-Man is a well-known and simple example.

Industrial Automation and Robotics: Reinforcement learning lets industrial systems and robots acquire the skills needed for their tasks on their own.

Traffic Control Systems: Reinforcement learning is used for real-time decision-making and optimisation in traffic control.

Advertising: Reinforcement learning helps businesses and marketers create personalized content and recommendations.

Zindi has hosted some challenges based on machine-learning solutions.

CC BY 4.0

This work is licensed under a Creative Commons Attribution 4.0 International License.