Reward based on state changes

Question


hifzajaved opened this issue 6 years ago · 1 comment

Hi, I have a quick question about designing the reward matrix for a POMDP using the difference between the previous and current states, rather than the action. In the examples you have provided, all the reward matrices depend only on the previous state and the action. Is there a reward function in DESPOT that accepts a different set of arguments?

Answer 1 · 2019-01-24T01:09:28.000Z


The reward is R(s, a, s'), based on the standard definition of the POMDP formalism.

David
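To make the R(s, a, s') form concrete, here is a minimal standalone sketch of a reward computed from the change between the previous and the next state. It does not use DESPOT itself: `GridState`, `kGoal`, `kStepCost`, `Reward`, and this `Step` are all hypothetical names made up for illustration. The shape of `Step` (state updated in place, reward returned through an out-parameter) is only assumed to resemble how a generative model step is usually written, so check the DESPOT headers for the actual interface before adapting it.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical 1-D world: the agent walks along a line toward a goal cell.
struct GridState {
    int position;
};

constexpr int kGoal = 5;             // hypothetical goal cell
constexpr double kStepCost = -1.0;   // hypothetical per-step cost

// R(s, a, s'): the reward depends on how the state changed, not on the action.
double Reward(const GridState& prev, int action, const GridState& next) {
    (void)action;  // unused here, kept to match the R(s, a, s') form
    const int before = std::abs(kGoal - prev.position);
    const int after = std::abs(kGoal - next.position);
    // Positive bonus for getting closer to the goal, penalty for moving away.
    return kStepCost + 2.0 * (before - after);
}

// Generative step: samples s' from (s, a), then scores the transition.
// Returns true when the next state is terminal.
bool Step(GridState& state, double random_num, int action, double& reward) {
    assert(action == -1 || action == 1);
    const GridState prev = state;  // snapshot of s before the transition
    // Assumed noise model: with probability 0.1 the move slips and the agent stays.
    if (random_num >= 0.1) {
        state.position += action;
    }
    reward = Reward(prev, action, state);
    return state.position == kGoal;
}
```

The key point is the snapshot of the previous state taken before the transition is applied: once both s and s' are available in the same function, the reward can be any function of their difference.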
