Reward based on state changes
hifzajaved opened this issue · 1 comment
hifzajaved commented
Hi, I have a quick question about designing the reward matrix for a POMDP based on the difference between the previous and current states, rather than on the action. In the examples you have provided, all the reward matrices depend only on the previous state and the action. Is there a reward function in DESPOT that accepts a different set of arguments?
davidyhsu commented
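In the standard definition of the POMDP formalism, the reward is R(s, a, s'), so it can depend on the next state s' as well as on the previous state s and the action a.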
—David
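To illustrate a reward of the form R(s, a, s'), here is a minimal, self-contained C++ sketch. It does not use the DESPOT headers, and every name in it (`GridState`, `Action`, `kGoal`, `Reward`, `Step`) is hypothetical. It assumes a generative step function that updates the state in place, so the previous state is copied before the transition and the reward is computed from the pair (previous, next).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical 1-D grid state for illustration; not a DESPOT type.
struct GridState {
    int position;
};

enum Action { MOVE_LEFT = 0, MOVE_RIGHT = 1 };

const int kGoal = 5;  // hypothetical goal cell

// R(s, a, s'): reward is the change in distance to the goal.
// The action is part of the signature but unused in this example.
double Reward(const GridState& prev, Action /*action*/, const GridState& next) {
    const int before = std::abs(kGoal - prev.position);
    const int after = std::abs(kGoal - next.position);
    return static_cast<double>(before - after);  // +1 closer, -1 farther, 0 unchanged
}

// Generative step: the state is modified in place, so keep a copy of the
// previous state to evaluate R(s, a, s') after the transition.
// random_num in [0, 1) drives the noise: the move succeeds with
// probability 0.9, otherwise the state stays the same.
// Returns true when the goal cell is reached.
bool Step(GridState& state, double random_num, Action action, double& reward) {
    assert(random_num >= 0.0 && random_num < 1.0);
    const GridState prev = state;  // s
    if (random_num < 0.9) {
        state.position += (action == MOVE_RIGHT) ? 1 : -1;  // s'
    }
    reward = Reward(prev, action, state);
    return state.position == kGoal;
}
```

The key design choice is copying the previous state before applying the transition. Once both states are available, the reward can use any function of their difference.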