XanderJC/scalable-birl

question about experiment settings

chenmao001 opened this issue · 2 comments

Your work is solid and promising. I have a question: DSFN uses DQN (in which the agent interacts with the environment) to optimize the policy each time the reward is updated. You also compare DSFN with AVRIL, so what settings make DSFN offline, or is it still online?
Thank you very much.

DSFN is entirely offline - the first line of the abstract should clarify that: "We introduce a novel Inverse Reinforcement Learning (IRL) method for batch settings where only expert demonstrations are given and no interaction with the environment is allowed"
Paper:
https://www.ijcai.org/Proceedings/2019/819

Hope that helps

Thanks for the quick reply. I just checked the DSFN code and found that the replay buffer used by its inner-loop DQN is fixed expert data. It is indeed offline, thanks.
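For anyone else who lands here, a minimal sketch of what this means in practice, assuming a toy tabular setup (this is not the actual DSFN code; `ExpertBuffer` and `q_update` are hypothetical names for illustration). The buffer is filled once with expert transitions and the inner-loop Q-update only ever samples from it, so there is no `env.step()` anywhere, which is what makes the procedure offline:

```python
import random
import numpy as np

class ExpertBuffer:
    """Fixed buffer of expert transitions (s, a, r_hat, s_next, done)."""
    def __init__(self, transitions):
        self._data = list(transitions)  # filled once, never appended to

    def sample(self, batch_size):
        return random.sample(self._data, batch_size)

def q_update(q_table, batch, gamma=0.99, lr=0.1):
    """One tabular Q-learning step on a batch of stored transitions."""
    for s, a, r_hat, s_next, done in batch:
        target = r_hat + (0.0 if done else gamma * np.max(q_table[s_next]))
        q_table[s, a] += lr * (target - q_table[s, a])

# Toy "expert" data: 5 states, 2 actions. The rewards r_hat would come
# from the current reward estimate, not from the environment.
rng = np.random.default_rng(0)
transitions = [
    (int(rng.integers(5)), int(rng.integers(2)), float(rng.normal()),
     int(rng.integers(5)), bool(rng.integers(2)))
    for _ in range(200)
]

buffer = ExpertBuffer(transitions)
q_table = np.zeros((5, 2))

# Inner loop: every batch comes from the fixed expert buffer;
# the environment is never queried.
for _ in range(1000):
    q_update(q_table, buffer.sample(32))
```

In the online setting, by contrast, the inner DQN loop would interleave `env.step()` calls and append fresh transitions to the buffer; the absence of that step is exactly the distinction the paper's "batch settings" phrase refers to.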