Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor : https://arxiv.org/abs/1801.01290
Environments: Hopper-v4, Walker2d-v4, HalfCheetah-v4, Ant-v4, Humanoid-v4, Swimmer-v4
- python == 3.10.0
- numpy == 1.23.5
- torch == 1.13.1
- gymnasium == 0.26.3
- mujoco == 2.2.0
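
Since the versions above are pinned exactly, a quick check of the installed packages can save some debugging. The following is a minimal sketch, not part of this repository: the `check_versions` helper and the `PINNED` dictionary are hypothetical names, and the sketch assumes the PyPI distribution names match the names in the list. The MuJoCo binding is left out of `PINNED` on purpose; add it under whichever distribution name you installed.

```python
from importlib import metadata

# Pins copied from the dependency list (assumed to be PyPI distribution names).
# The MuJoCo binding is intentionally omitted; add it if you want it checked.
PINNED = {"numpy": "1.23.5", "torch": "1.13.1", "gymnasium": "0.26.3"}


def check_versions(pinned, lookup=metadata.version):
    """Return {name: (expected, found)} for each mismatch.

    `found` is None when the package is not installed. `lookup` can be
    swapped out for testing; by default it queries the current environment.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Drop a local build suffix such as "+cu117" before comparing.
        base = found.split("+")[0] if found is not None else None
        if base != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    mismatches = check_versions(PINNED)
    for name, (expected, found) in mismatches.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not mismatches:
        print("All pinned packages match.")
```

Running the script prints one line per mismatched or missing package. The Python version itself is not checked here.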

To see the available arguments, run `python3 train.py -h`.

To train, run `python3 train.py`. To evaluate, run `python3 eval.py`.

All credit goes to @pranz24 for his brilliant PyTorch implementation of SAC.