simonmeister/pysc2-rl-agents

NaNs output from the policy network

stevenbinhu21 opened this issue · 2 comments

Hi, I was training the model on one of the minigames. After about 60,000 episodes, the policy's action probabilities were all NaN. I haven't been able to track down the problem yet, since it takes a long time to reach that point for debugging. Have you encountered this problem before?

Which of the minigames are you training on? Yes, we also encountered it occasionally in DefeatRoaches and DefeatZerglingsAndBanelings (if I remember correctly). However, it only happened in some runs, and I wasn't able to identify the source of the problem either.

I am also encountering NaNs while running DefeatZerglingsAndBanelings and CollectMineralShards. Are there any updates on this issue?
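
Since the NaNs only show up after a long run, one way to narrow the search might be to check intermediate values on every training step and stop at the first non-finite one, so the offending batch can be inspected right away. The snippet below is only a minimal sketch of that idea and is not code from this repository: the helper `first_non_finite` and the dictionary keys (`policy_probs`, `value`, `loss`) are hypothetical names, and the values are assumed to be flattened into plain sequences of floats (for a NumPy array, something like `arr.ravel()`).

```python
import math


def first_non_finite(named_values):
    """Return the name of the first entry containing NaN or +/-Inf, else None.

    named_values: dict mapping a label to a flat iterable of floats.
    (Hypothetical helper for illustration, not part of pysc2-rl-agents.)
    """
    for name, values in named_values.items():
        if any(not math.isfinite(v) for v in values):
            return name
    return None


# Example with made-up numbers standing in for one training step's outputs.
step_outputs = {
    "policy_probs": [0.25, 0.75],
    "value": [float("nan")],
    "loss": [0.1],
}

bad = first_non_finite(step_outputs)
if bad is not None:
    # In a real run, this is where one might save the batch and a checkpoint.
    print("non-finite values found in:", bad)
```

Calling a check like this once per update would make the run stop at the first step where something goes wrong, rather than when the action probabilities are already all NaN.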