Question about the code?
wxhheian opened this issue · 2 comments
wxhheian commented
```python
if direct_inference and prompt in ['Pretended_CoT', 'Knowledge_Enhancement']:
    # Penultimate hidden layer, last token position
    outputs = hidden_states[-2][:, -1, :]
else:
    # Last hidden layer, last token position
    outputs = hidden_states[-1][:, -1, :]
```
Could you explain why the branch above uses hidden_states[-2] while the one below uses hidden_states[-1]? Shouldn't both take the last layer?
ZBWpro commented
Hi~
We explain this in detail in the preprint we uploaded to ArXiv:
For PromptEOL, we adhere to the methodology proposed in the original paper, taking the output vector corresponding to the concluding quotation mark from the model’s last hidden layer as the sentence embedding. For Pretended CoT and Knowledge Enhancement, we opt for the encoding from the penultimate hidden layer for the terminal token, as this approach demonstrates superior outcomes.
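For context, here is a minimal sketch (not from this repository) of how the hidden_states tuple is indexed, assuming a Hugging Face causal LM called with output_hidden_states=True; the model name and prompt text below are placeholders, not the paper's actual configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the paper uses its own backbone.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = 'This sentence: "A cat sits on the mat." means in one word: "'
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # hidden_states is a tuple of (num_layers + 1) tensors, each of shape
    # (batch, seq_len, hidden_dim); index 0 is the embedding layer output.
    hidden_states = model(**inputs, output_hidden_states=True).hidden_states

# [-1] selects the last transformer layer, [-2] the penultimate one;
# [:, -1, :] then takes the representation of the final (terminal) token.
last_layer_emb = hidden_states[-1][:, -1, :]    # PromptEOL-style
penultimate_emb = hidden_states[-2][:, -1, :]   # Pretended CoT / Knowledge Enhancement

print(last_layer_emb.shape, penultimate_emb.shape)  # both: (1, hidden_dim)
```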
wxhheian commented
Thanks, that was my oversight; I didn't read the paper carefully enough.