PyTorch implementation of Soft Actor-Critic (SAC, https://arxiv.org/pdf/1801.01290.pdf), a deep reinforcement learning algorithm, tested on the inverted pendulum swing-up problem, a classic task in control. The pendulum starts in a random position, and the goal is to swing it up so that it stays upright. The task uses the OpenAI gym environment Pendulum-v0 (https://gym.openai.com/envs/Pendulum-v0/).
To run the code, you need the following libraries/software installed on your system (preferably Ubuntu or another Linux distro):
- python: Required version >= 3.5. Installing pip is also useful (if your package manager is apt):
sudo apt install python3-pip
- PyTorch: Recommended to install via pip. https://pytorch.org/
- numpy:
pip install numpy
- jupyter:
pip install jupyter
- matplotlib:
pip install matplotlib
- seaborn:
pip install seaborn
- IPython:
sudo apt install python3-ipython
- tqdm:
pip install tqdm
- OpenAI gym: https://gym.openai.com/docs/
It is recommended to run the code in a virtualenv.
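After installing the dependencies, a quick sanity check (not part of the repo) is to confirm that gym and the Pendulum-v0 environment work. The snippet below assumes an older gym release in which Pendulum-v0 and the four-value step API are still available:

```python
import gym

# Create the inverted pendulum swing-up environment used in this repo.
env = gym.make("Pendulum-v0")

state = env.reset()
for _ in range(10):
    action = env.action_space.sample()             # random torque in [-2, 2]
    state, reward, done, info = env.step(action)   # classic (pre-0.26) gym step API
    print(reward)
env.close()
```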
Install the required software and clone this repo. To test the code or perform experiments, start a new jupyter session by running
jupyter notebook
in a terminal, which launches the jupyter notebook app in a browser. In the notebook dashboard, navigate to the notebook softac and run it.
To train/test the model, execute
python softac.py
- gym_utils.py: Utility functions to get parameters of the gym environment used, e.g. the state and action dimensions (see the sketch after this list).
- model.py: Deep learning network for the agent.
- replay_buffer.py: A replay buffer that stores state-action transitions and supports random sampling from them (see the sketch after this list).
- softac.ipynb: Soft Actor-Critic implementation in a jupyter notebook for testing the code and performing experiments.
- softac.py: Implementation of the algorithm for training and testing on the task of inverted pendulum (default).
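For reference, the kind of environment-parameter lookup provided by gym_utils.py can be sketched roughly as below. The function name get_env_dims and its return values are illustrative assumptions, not the repo's actual API:

```python
import gym

def get_env_dims(env):
    # State and action dimensionality for a continuous-control env such as Pendulum-v0.
    state_dim = env.observation_space.shape[0]
    action_dim = env.action_space.shape[0]
    # Largest action magnitude, useful for scaling the policy's tanh output.
    action_limit = float(env.action_space.high[0])
    return state_dim, action_dim, action_limit

env = gym.make("Pendulum-v0")
print(get_env_dims(env))  # (3, 1, 2.0) for Pendulum-v0
```

Similarly, a minimal replay buffer with uniform random sampling might look like the following sketch; the actual replay_buffer.py may store or batch transitions differently:

```python
import random
import numpy as np

class ReplayBuffer:
    """Fixed-capacity buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.position = 0

    def push(self, state, action, reward, next_state, done):
        # Append until full, then overwrite the oldest transition (circular buffer).
        if len(self.buffer) < self.capacity:
            self.buffer.append(None)
        self.buffer[self.position] = (state, action, reward, next_state, done)
        self.position = (self.position + 1) % self.capacity

    def sample(self, batch_size):
        # Uniformly sample a batch and stack each field into a numpy array.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.stack, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```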
The repo is still under construction. To report bugs or contribute changes, open an issue or a pull request.