AdaCompNUS/despot

Reward based on state changes

hifzajaved opened this issue · 1 comment

Hi, I have a quick question about designing the reward matrix for a POMDP so that it depends on the difference between the previous and current states, rather than on the action. In the examples you provide, all the reward matrices depend only on the previous state and the action. Is there a reward function in DESPOT that accepts a different set of arguments?
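
To make the question concrete, here is a minimal standalone sketch of the kind of reward I have in mind. It does not use DESPOT's API: `GridState`, `DistanceToGoal`, and `StepWithDeltaReward` are hypothetical names made up for illustration, and the 1-D world is an assumption. The idea is that the step function keeps a copy of the state before the transition and computes the reward from the change between the two states.

```cpp
#include <cstdlib>

// Hypothetical 1-D state: the agent's position on a line and a fixed goal.
struct GridState {
    int position;
    int goal;
};

// Absolute distance from the agent to the goal.
int DistanceToGoal(const GridState& s) {
    return std::abs(s.goal - s.position);
}

// Hypothetical generative step (not a DESPOT function).
// Applies the action (-1, 0, or +1) to the state in place and returns a
// reward computed from the previous and the new state: positive when the
// agent moved closer to the goal, negative when it moved away.
double StepWithDeltaReward(GridState& state, int action) {
    const GridState previous = state;  // copy taken before the transition
    state.position += action;          // deterministic transition
    return static_cast<double>(DistanceToGoal(previous) - DistanceToGoal(state));
}
```

With the goal at 3 and the agent starting at 0, action +1 gives a reward of 1, and a following action of -1 gives a reward of -1, so the reward is determined by how the state changed rather than by a lookup on the previous state and action.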