Probability of Continuing / Discount Modeling
juliusfrost opened this issue · 0 comments
Add a dense model that predicts the probability of continuing (the discount factor). This is described only briefly in Appendix A, for use in Atari environments: "We predict the discount factor from the latent state with a binary classifier that is trained towards the soft labels of 0 and γ."
`pcont` is the equivalent term in the TensorFlow implementation.
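To make the quoted training target concrete, here is a minimal sketch in plain Python of the soft labels and the binary cross-entropy loss. It is an illustration under assumptions, not the TensorFlow implementation: the function names, the `GAMMA = 0.99` value, and the use of per-step `done` flags are all hypothetical, and the logits are assumed to come from a dense model applied to the latent state.

```python
import math

GAMMA = 0.99  # assumed discount factor; this issue does not specify a value


def soft_discount_targets(dones, gamma=GAMMA):
    """Soft labels: 0 where the episode terminated, gamma otherwise."""
    return [0.0 if done else gamma for done in dones]


def bce_from_logits(logits, targets):
    """Mean binary cross-entropy between sigmoid(logits) and soft targets.

    Uses the numerically stable form max(x, 0) - x * t + log(1 + exp(-|x|)).
    """
    losses = [
        max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
        for x, t in zip(logits, targets)
    ]
    return sum(losses) / len(losses)


def predicted_discount(logit):
    """Sigmoid of the logit, read as the predicted discount for one state."""
    if logit >= 0.0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)  # avoids overflow for large negative logits
    return e / (1.0 + e)


if __name__ == "__main__":
    dones = [False, False, True]          # hypothetical episode flags
    logits = [4.0, 3.5, -2.0]             # hypothetical dense-model outputs
    targets = soft_discount_targets(dones)
    print("targets:", targets)
    print("loss:", bce_from_logits(logits, targets))
    print("predicted discounts:", [predicted_discount(x) for x in logits])
```

Because the labels are soft, the loss for a non-terminal step is minimized when the predicted probability equals γ rather than 1, so the classifier output can be used directly as a per-step discount.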