Adding a sample_action method for ActorCritic
lemikhovalex opened this issue · 0 comments
Hello! I've been learning how to code RL from your repo. I replaced the duplicated lines of code in
def train
def update_policy
with a call to the agent's method self.sample_action(). The agent now seems to solve the CartPole problem about 2x slower (measured in number of episodes), and it happens every time. I have no idea what is going on in torch and haven't found anything on the Internet.
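For context, here is a minimal sketch of what such a helper might look like. It is not the exact notebook code: the ActorCritic layout, the optional `action` argument, and the toy linear networks are assumptions made only for illustration. The idea is that the rollout step samples a fresh action, while the update step passes the stored actions back in so their log-probabilities are re-evaluated rather than resampled.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical


class ActorCritic(nn.Module):
    """Hypothetical stand-in for the actor-critic wrapper (assumed layout)."""

    def __init__(self, actor, critic):
        super().__init__()
        self.actor = actor
        self.critic = critic

    def forward(self, state):
        # Returns unnormalized action scores and the state-value estimate.
        return self.actor(state), self.critic(state)

    def sample_action(self, state, action=None):
        # Assumed signature: `action=None` samples a new action;
        # passing stored actions only evaluates their log-probabilities.
        action_pred, value_pred = self(state)
        dist = Categorical(probs=F.softmax(action_pred, dim=-1))
        if action is None:
            action = dist.sample()
        return action, dist.log_prob(action), value_pred


# CartPole-sized toy networks: 4 observations in, 2 actions out.
agent = ActorCritic(nn.Linear(4, 2), nn.Linear(4, 1))
states = torch.randn(5, 4)

# Rollout (as in train): sample fresh actions, no gradients needed.
with torch.no_grad():
    actions, old_log_probs, _ = agent.sample_action(states)

# Update (as in update_policy): re-evaluate the stored actions with gradients.
same_actions, new_log_probs, values = agent.sample_action(states, action=actions)
```

If the update step calls the helper without the stored actions, it draws new actions and consumes extra random numbers, so the new log-probabilities no longer correspond to the actions taken during the rollout. That may be worth checking, but it is only a guess and not a confirmed cause of the slowdown.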
Can you please help me?
https://github.com/lemikhovalex/pytorch-rl
5_tr - Proximal Policy Optimization (PPO) [CartPole]-Copy1.ipynb