tpbarron/pytorch-ppo

--use-joint-pol-val is broken


The --use-joint-pol-val option is broken.

See line 97, marked in the excerpt below:

def update_params_actor_critic(batch):
    rewards = torch.Tensor(batch.reward)
    masks = torch.Tensor(batch.mask)
    actions = torch.Tensor(np.concatenate(batch.action, 0))
    states = torch.Tensor(batch.state)
    values = value_net(Variable(states))  # <- line 97: value_net is called here, but the actor-critic model does not use it
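
One way the value lookup could avoid a separate value_net is to read the values from the joint model's own forward pass. The following is only a minimal sketch under assumptions: the class JointActorCritic, its layer sizes, its three-output forward signature, and the helper joint_values are all hypothetical and are not taken from the repository. It uses current PyTorch, where plain tensors replace Variable.

```python
import torch
import torch.nn as nn


class JointActorCritic(nn.Module):
    """Hypothetical joint model: shared trunk with a policy head and a value head."""

    def __init__(self, num_inputs, num_actions, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(num_inputs, hidden), nn.Tanh())
        self.action_mean = nn.Linear(hidden, num_actions)
        # State-independent log standard deviation, one entry per action dimension.
        self.action_log_std = nn.Parameter(torch.zeros(1, num_actions))
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, states):
        features = self.trunk(states)
        mean = self.action_mean(features)
        log_std = self.action_log_std.expand_as(mean)
        value = self.value_head(features)
        return mean, log_std, value


def joint_values(model, states):
    """Return state values from the joint model rather than a separate value_net."""
    _, _, values = model(states)
    return values


# Assumed shapes for illustration: 5 states with 3 features each, 2 action dimensions.
model = JointActorCritic(num_inputs=3, num_actions=2)
states = torch.randn(5, 3)
values = joint_values(model, states)
```

In update_params_actor_critic, the marked line would then take its values from the joint model in this way, assuming that model exposes a value output at all.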