About expert_assessment for self-built training sets

Question

wyield opened this issue 4 months ago · 4 comments

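    From anno.md and the downloaded dataset, I can see that the data structure also contains expert_assessment. What is the structure of this data? Have you worked out a good way to generate this kind of data automatically? And why is the index of this data always offset by -1 relative to the other data?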

Answer 1 · 2024-09-05T05:37:39.000Z

@wyield Please read the paper for what that information means. The -1 offset arises because the evaluation of the current environment comes from the value function of the RL expert at the next step. This is related to RL and Markov chains.

Answer 2 · 2024-09-29T10:40:28.000Z

Did you figure out what data is in expert_assessment? I ran into the same question as you, and I can't work out what is stored in expert_assessment.

Answer 3 · 2024-09-29T10:58:38.000Z

Hi @czg2066, expert_assessment consists of 3 parts: expert_feature, the RL value, and the action.

action = expert_assessment[-1]; its range is [0, 38].
value = expert_assessment[-2]; it is the output of the RL value head.
feature = expert_assessment[:-2]; it is the feature that is fed to the value/actor heads.
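
For anyone who wants to try this, the indexing above can be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes expert_assessment has already been loaded as a flat 1-D array, that the action is stored as a whole number in the inclusive range [0, 38], and the helper name split_expert_assessment and the feature length of 8 in the synthetic example are made up for illustration (the real feature size is not stated in this thread).

```python
import numpy as np


def split_expert_assessment(expert_assessment):
    """Split a flat expert_assessment vector into (feature, value, action).

    Hypothetical helper: the layout follows the description in this thread
    (feature = [:-2], value = [-2], action = [-1]).
    """
    arr = np.asarray(expert_assessment, dtype=np.float32).ravel()
    if arr.size < 3:
        raise ValueError("expected at least one feature entry plus value and action")

    feature = arr[:-2]                    # input to the value/actor heads
    value = float(arr[-2])                # output of the RL value head
    action = int(round(float(arr[-1])))   # assumed to be a discrete action index

    # The thread states the action range is [0, 38]; treated here as inclusive.
    if not 0 <= action <= 38:
        raise ValueError(f"action {action} is outside the expected range [0, 38]")
    return feature, value, action


# Synthetic example; the feature length 8 is arbitrary, not the real size.
example = np.concatenate([np.zeros(8, dtype=np.float32), [0.5, 17.0]])
feature, value, action = split_expert_assessment(example)
print(feature.shape, value, action)
```

The range check is there only to catch a misaligned or differently laid-out array early; drop it if your data stores the action in another form.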

Answer 4 · 2024-09-29T11:03:13.000Z

OK, thanks.