Omegastick/pytorch-cpp-rl

Where is the formula in the C++ file?

fatalfeel opened this issue · 2 comments

https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf

On page 9 of this PDF, the formula is given as:
$p_\theta(\tau) = p(s_1)\, p_\theta(a_1 \mid s_1)\, p(s_2 \mid s_1, a_1)\, p_\theta(a_2 \mid s_2)\, p(s_3 \mid s_2, a_2) \cdots$

Where is this formula in the C++ files? Which function implements it, or where is it defined?
Please help me find it.

AFAIK, that formula is a natural consequence of the policy gradient algorithm, and not directly defined anywhere in the code.
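
To make that concrete, here is a short derivation sketch (standard policy-gradient algebra, not something taken from this repository, and assuming a trajectory of $T$ steps). Taking the log of the trajectory probability turns the product into sums, and the terms that do not depend on $\theta$ drop out of the gradient, which is why code typically only ever needs $\log p_\theta(a_t \mid s_t)$:

```latex
\log p_\theta(\tau)
  = \log p(s_1)
  + \sum_{t=1}^{T} \log p_\theta(a_t \mid s_t)
  + \sum_{t=1}^{T} \log p(s_{t+1} \mid s_t, a_t)

\nabla_\theta \log p_\theta(\tau)
  = \sum_{t=1}^{T} \nabla_\theta \log p_\theta(a_t \mid s_t)
```

The environment terms $p(s_1)$ and $p(s_{t+1} \mid s_t, a_t)$ are never evaluated; they are sampled implicitly by stepping the environment.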

In a Bayes network, the conditional probability is actually calculated in the code (http://dlib.net/bayes_net_ex.cpp.html).

The PPO algorithm has formulas like this, e.g. $p_\theta(a_t \mid s_t)$:
https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf

I cannot connect $p_\theta(a_t \mid s_t)$, or the many summations, to the source code.
Does Y = W x Input + B represent this probability?
I am confused about how the formula relates to the source code. Please help me work it out.
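
For what it is worth, here is a minimal sketch of how $p_\theta(a_t \mid s_t)$ usually shows up in code for a discrete action space. It is plain standard C++ rather than libtorch, and every name in it (`LinearPolicy`, `logits`, `log_prob`, `trajectory_log_prob`) is made up for illustration and is not taken from this repository. The assumption is a single linear layer followed by a softmax: Y = W x Input + B gives unnormalised scores (logits), and the softmax of those scores is the probability. The summation over time steps is then just a loop that adds up log-probabilities.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical one-layer policy: logits = W * state + b.
struct LinearPolicy {
    std::vector<std::vector<double>> W;  // shape [num_actions][state_dim]
    std::vector<double> b;               // shape [num_actions]
};

// Y = W x Input + B. These are scores (logits), not yet probabilities.
std::vector<double> logits(const LinearPolicy& pi, const std::vector<double>& s) {
    assert(pi.W.size() == pi.b.size());
    std::vector<double> y(pi.b);
    for (std::size_t a = 0; a < pi.W.size(); ++a) {
        assert(pi.W[a].size() == s.size());
        for (std::size_t i = 0; i < s.size(); ++i) {
            y[a] += pi.W[a][i] * s[i];
        }
    }
    return y;
}

// log p_theta(a | s) = y[a] - log(sum_k exp(y[k])), i.e. a log-softmax.
// The maximum is subtracted first for numerical stability.
double log_prob(const LinearPolicy& pi, const std::vector<double>& s, std::size_t a) {
    const std::vector<double> y = logits(pi, s);
    assert(a < y.size());
    const double max_y = *std::max_element(y.begin(), y.end());
    double sum = 0.0;
    for (double v : y) {
        sum += std::exp(v - max_y);
    }
    return y[a] - max_y - std::log(sum);
}

// The theta-dependent part of log p_theta(tau): sum_t log p_theta(a_t | s_t).
// The environment terms p(s_1) and p(s_{t+1} | s_t, a_t) are deliberately absent.
double trajectory_log_prob(const LinearPolicy& pi,
                           const std::vector<std::vector<double>>& states,
                           const std::vector<std::size_t>& actions) {
    assert(states.size() == actions.size());
    double total = 0.0;
    for (std::size_t t = 0; t < states.size(); ++t) {
        total += log_prob(pi, states[t], actions[t]);
    }
    return total;
}
```

In a PPO-style update, the ratio between the new and old policy would be `std::exp(new_log_prob - old_log_prob)` for each stored (state, action) pair, so the product over the whole trajectory never has to be formed explicitly.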