About expert_assessment for a self-built training set
wyield opened this issue · 4 comments
wyield commented
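Looking through anno.md and the downloaded dataset, I can see that the data structure also contains expert_assessment. What is the structure of this data? Have you worked out a good way to generate this kind of data automatically? And why is the index of this data offset by -1 relative to the other data?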
jiaxiaosong1002 commented
@wyield Please read the paper for what this information means. The -1 offset arises because the evaluation of the current environment comes from the value function of the RL expert at the next step. It is related to RL and Markov chains.
czg2066 commented
Did you figure out what data is in expert_assessment? I ran into the same question as you, and I can't work out what is stored in expert_assessment.
jayyoung0802 commented
Hi @czg2066, expert_assessment consists of 3 parts: the expert feature, the RL value, and the action.
- action = expert_assessment[-1]; its range is [0, 38].
- value = expert_assessment[-2]; it is the output of the RL value head.
- feature = expert_assessment[:-2]; it is the feature that is fed to the value/actor heads.
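
The slicing above can be sketched as a small helper. This is only an illustration, not code from the repository: it assumes expert_assessment has already been loaded as a flat 1-D sequence of numbers, the function name `split_expert_assessment` is hypothetical, and the feature length in the toy example is made up.

```python
def split_expert_assessment(expert_assessment):
    """Split a flat expert_assessment sequence into (feature, value, action).

    Assumption: the input is a flat 1-D sequence of numbers laid out as
    [feature..., value, action], as described in the comment above.
    """
    values = list(expert_assessment)
    if len(values) < 3:
        raise ValueError("expected at least one feature entry plus value and action")

    feature = values[:-2]              # input to the value/actor heads
    value = float(values[-2])          # output of the RL value head
    action = int(round(values[-1]))    # action index, possibly stored as a float

    # The thread states that the action lies in [0, 38].
    if not 0 <= action <= 38:
        raise ValueError(f"action index {action} is outside [0, 38]")
    return feature, value, action


# Toy input with a made-up 4-dimensional feature; the real length may differ.
feature, value, action = split_expert_assessment([0.1, 0.2, 0.3, 0.4, -1.5, 17.0])
print(len(feature), value, action)
```

If the data is stored as a NumPy array or tensor instead, the same negative-index slicing should apply, but that depends on how the dataset files are actually loaded.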
czg2066 commented
OK, thanks.