- ➤ 📝 About The Project
- ➤ 💾 Key Project File Description
- ➤ 🚀 Dependencies
- ➤ 🔨 Usage
- ➤ ☕ Buy me a coffee
- ➤ 📜 Credits
- ➤ License
This repository is my personal collection and demonstration of various deep reinforcement learning (DRL) algorithms, showcasing my grasp and application of advanced concepts in the field. Each model's directory provides richly commented code, designed to display not just the technical implementation but also my understanding of the strategic underpinnings of each algorithm.
- The `DQN` directory implements the DQN algorithm. DQN extends Q-learning by using a deep neural network to approximate the Q-value function. The code includes the network architecture, experience replay, and the epsilon-greedy strategy for action selection. It is primarily based on the paper *Playing Atari with Deep Reinforcement Learning* by Mnih et al. (2013). A minimal epsilon-greedy sketch follows this list.
- The `DDPG` folder contains the implementation of DDPG, a policy-gradient algorithm that learns a deterministic policy over continuous action spaces. The code handles network updates, policy learning, and the Ornstein-Uhlenbeck process for action exploration. The foundational paper is *Continuous Control with Deep Reinforcement Learning* by Lillicrap et al. (2016). An Ornstein-Uhlenbeck noise sketch appears below.
- The `TD3` folder contains the TD3 algorithm, an extension of DDPG that reduces function-approximation error by using twin Q-networks and delayed policy updates. This approach is elaborated in the paper *Addressing Function Approximation Error in Actor-Critic Methods* by Fujimoto et al. (2018). A target-computation sketch appears below.
- The `PPO` folder implements PPO, which stabilizes policy learning by balancing exploration and exploitation through a clipped surrogate objective. The algorithm is detailed in the paper *Proximal Policy Optimization Algorithms* by Schulman et al. (2017). A clipped-objective sketch appears below.
- The `MADDPG` folder explores the MADDPG framework, designed for multi-agent environments. It extends DDPG with a critic that conditions on the actions of the other agents, improving training stability and performance in cooperative or competitive scenarios. The key concepts are discussed in the paper *Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments* by Lowe et al. (2017). A centralized-critic sketch appears below.
- The `MAPPO` folder implements MAPPO, adapting the robust single-agent PPO algorithm to multi-agent settings. This folder includes adaptations for centralized training with decentralized execution, suitable for complex multi-agent scenarios. The approach is based on the paper *The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games* by Yu et al. (2022). A minimal actor/centralized-value sketch appears below.
- COMING SOON
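To make the descriptions above concrete, the short sketches below illustrate one core idea from each algorithm. They are minimal, self-contained PyTorch illustrations with assumed network sizes and names, not the code in the algorithm folders. First, DQN's epsilon-greedy action selection over a small Q-network:

```python
import random

import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Small MLP mapping a state to one Q-value per discrete action."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def epsilon_greedy(q_net: QNetwork, state: torch.Tensor,
                   epsilon: float, n_actions: int) -> int:
    """With probability epsilon explore randomly, otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(state).argmax().item())
```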
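For DDPG, here is a sketch of the Ornstein-Uhlenbeck process used for temporally correlated exploration noise. The default hyperparameters (`theta=0.15`, `sigma=0.2`) follow common usage and are assumptions, not values taken from this repository:

```python
import numpy as np


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise for continuous-action exploration.

    Follows dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1),
    which drifts back toward the mean mu between random perturbations.
    """

    def __init__(self, size: int, mu: float = 0.0, theta: float = 0.15,
                 sigma: float = 0.2, dt: float = 1e-2):
        self.mu = mu * np.ones(size)
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.reset()

    def reset(self) -> None:
        # Restart the process at the mean, e.g. at the start of an episode.
        self.x = self.mu.copy()

    def sample(self) -> np.ndarray:
        dx = self.theta * (self.mu - self.x) * self.dt \
             + self.sigma * np.sqrt(self.dt) * np.random.randn(len(self.x))
        self.x = self.x + dx
        return self.x
```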
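For TD3, here is a sketch of the twin-critic target computation with target-policy smoothing; `actor_t`, `critic1_t`, and `critic2_t` stand for assumed target networks that map states (and actions) to tensors:

```python
import torch


def td3_target(reward, done, next_state, actor_t, critic1_t, critic2_t,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute the TD3 bootstrap target.

    Target-policy smoothing adds clipped noise to the target action, and
    taking the minimum of two target critics curbs Q-value overestimation.
    """
    with torch.no_grad():
        next_action = actor_t(next_state)
        noise = (torch.randn_like(next_action) * noise_std)
        noise = noise.clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)
        target_q = torch.min(critic1_t(next_state, next_action),
                             critic2_t(next_state, next_action))
        return reward + gamma * (1.0 - done) * target_q
```

The "delayed" part of TD3 is then just a schedule: the actor and the target networks are updated once every few critic updates rather than on every step.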
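For PPO, the clipped surrogate objective fits in a few lines; the input tensors are assumed to be per-sample log-probabilities and advantage estimates:

```python
import torch


def ppo_clipped_loss(log_probs_new: torch.Tensor,
                     log_probs_old: torch.Tensor,
                     advantages: torch.Tensor,
                     clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate objective: limits how far the new policy can
    move from the old one in a single update (loss = negative objective)."""
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```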
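For MADDPG, here is a sketch of a centralized critic that scores one agent's Q-value while conditioning on every agent's observations and actions; the dimensions and names are illustrative assumptions:

```python
import torch
import torch.nn as nn


class CentralizedCritic(nn.Module):
    """MADDPG-style critic: estimates one agent's Q-value from the joint
    observations and actions of all agents (centralized training)."""

    def __init__(self, n_agents: int, obs_dim: int, act_dim: int,
                 hidden: int = 128):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, all_obs: torch.Tensor,
                all_actions: torch.Tensor) -> torch.Tensor:
        # all_obs: (batch, n_agents * obs_dim)
        # all_actions: (batch, n_agents * act_dim)
        return self.net(torch.cat([all_obs, all_actions], dim=-1))
```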
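Finally, for MAPPO, a sketch of centralized training with decentralized execution: each actor acts from its local observation only, while the value network is trained on the global state. All names and sizes are illustrative:

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class DecentralizedActor(nn.Module):
    """Acts from its own local observation only (decentralized execution)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> Categorical:
        return Categorical(logits=self.net(obs))


class CentralizedValue(nn.Module):
    """Value function trained on the global state (centralized training)."""

    def __init__(self, global_state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(global_state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, global_state: torch.Tensor) -> torch.Tensor:
        return self.net(global_state)
```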
All algorithms are implemented using the PyTorch framework.
The easiest way to get started with the deep reinforcement learning algorithms in this repository is to set up a local development environment. Follow these steps to install and run the implementations:
1. Clone the repository:

   ```bash
   git clone https://github.com/i1Cps/reinforcement-learning-work.git
   cd reinforcement-learning-work
   ```
2. Create a virtual environment (optional but recommended):

   ```bash
   python3 -m venv env
   source env/bin/activate  # On Windows use `env\Scripts\activate`
   ```
3. Install the required dependencies:

   ```bash
   pip3 install -r requirements.txt
   ```
4. Run a specific algorithm (example with PPO):

   ```bash
   cd algorithms/ppo
   python3 main.py
   ```
5. Plot the results:

   ```bash
   cd data
   python3 plot.py
   ```
6. View the generated plots in `algorithms/<specific-algorithm>/data/plots`.
Whether you use this project, have learned something from it, or just like it, please consider supporting it by buying me a coffee, so I can dedicate more time to open-source projects like this (҂⌣̀_⌣́)
Theo Moore-Calters
Special Thanks to:
Licensed under MIT.