Question about the code?
wxhheian opened this issue · 2 comments
wxhheian commented
```python
if direct_inference and prompt in ['Pretended_CoT', 'Knowledge_Enhancement']:
    # Penultimate hidden layer, last token position
    outputs = hidden_states[-2][:, -1, :]
else:
    # Last hidden layer, last token position
    outputs = hidden_states[-1][:, -1, :]
```
Could you explain why the branch above uses hidden_states[-2] while the one below uses hidden_states[-1]? Shouldn't both take the last layer?
ZBWpro commented
Hi~
We explain this in detail in the preprint we uploaded to ArXiv:
For PromptEOL, we adhere to the methodology proposed in the original paper, taking the output vector corresponding to the concluding quotation mark from the model’s last hidden layer as the sentence embedding. For Pretended CoT and Knowledge Enhancement, we opt for the encoding from the penultimate hidden layer for the terminal token, as this approach demonstrates superior outcomes.
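For context, here is a minimal sketch (not from this repository) of how the hidden_states tuple is indexed, assuming a Hugging Face causal LM called with output_hidden_states=True; the model name and prompt text below are placeholders, not the paper's actual configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the paper uses its own backbone.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = 'This sentence: "A cat sits on the mat." means in one word: "'
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # hidden_states is a tuple of (num_layers + 1) tensors, each of shape
    # (batch, seq_len, hidden_dim); index 0 is the embedding layer output.
    hidden_states = model(**inputs, output_hidden_states=True).hidden_states

# [-1] selects the last transformer layer, [-2] the penultimate one;
# [:, -1, :] then takes the representation of the final (terminal) token.
last_layer_emb = hidden_states[-1][:, -1, :]    # PromptEOL-style
penultimate_emb = hidden_states[-2][:, -1, :]   # Pretended CoT / Knowledge Enhancement

print(last_layer_emb.shape, penultimate_emb.shape)  # both: (1, hidden_dim)
```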
wxhheian commented
Thanks, that was my oversight; I didn't read the paper carefully enough.