zjunlp/EasyEdit

Error when editing ZsRE-test-all.json with WISE

When editing ZsRE-test-all.json with WISE (building on run_knowedit_llama2.py, with the loc_prompts that WISE requires added), the following error is raised:

Traceback (most recent call last):
  File "/home/xsong/EasyEdit/examples/run_knowedit_llama2.py", line 230, in <module>
    metrics, edited_model, _ = editor.edit(
  File "/home/xsong/EasyEdit/examples/../easyeditor/editors/editor.py", line 160, in edit
    return self.edit_requests(requests, sequential_edit, verbose, test_generation=test_generation, **kwargs)
  File "/home/xsong/EasyEdit/examples/../easyeditor/editors/editor.py", line 333, in edit_requests
    edit_evaluation(all_metrics, request, edited_model, i, test_generation, icl_examples, **kwargs)
  File "/home/xsong/EasyEdit/examples/../easyeditor/editors/editor.py", line 312, in edit_evaluation
    "post": compute_edit_quality(edited_model, self.model_name, self.hparams, self.tok, request, self.hparams.device, eval_metric=eval_metric, test_generation=test_generation),
  File "/home/xsong/EasyEdit/examples/../easyeditor/evaluate/evaluate.py", line 78, in compute_edit_quality
    compute_locality_quality(model, model_name, hparams, tok, locality_key,
  File "/home/xsong/EasyEdit/examples/../easyeditor/evaluate/evaluate.py", line 153, in compute_locality_quality
    loc_tokens = test_prediction_acc(model, tok, hparams, prompt, locality_ground_truth, device, locality=True, vanilla_generation=hparams.alg_name=='GRACE')
  File "/home/xsong/EasyEdit/examples/../easyeditor/evaluate/evaluate_utils.py", line 126, in test_prediction_acc
    outputs = model(**prompt_target_tok)
  File "/home/xsong/EasyEdit/examples/../easyeditor/models/wise/WISE.py", line 97, in __call__
    return self.model(**kwargs)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1208, in forward
    outputs = self.model(
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1018, in forward
    layer_outputs = decoder_layer(
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 756, in forward
    hidden_states = self.mlp(hidden_states)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 240, in forward
    down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
  File "/home/xsong/anaconda3/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xsong/EasyEdit/examples/../easyeditor/models/wise/WISE.py", line 428, in forward
    if min_dist.item() < threshold:
RuntimeError: a Tensor with 2 elements cannot be converted to Scalar
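For context: the failing check assumes min_dist holds a single routing distance, and PyTorch's .item() raises exactly this error on any multi-element tensor. A minimal sketch reproducing the failure in plain PyTorch (independent of EasyEdit):

```python
import torch

# WISE's routing check does `min_dist.item() < threshold`, which assumes
# min_dist contains exactly one value (batch size 1).
min_dist = torch.tensor([0.37])        # bs=1: fine
print(min_dist.item() < 0.5)           # True

min_dist = torch.tensor([0.37, 0.62])  # bs=2, e.g. two locality prompts batched
min_dist.item()                        # RuntimeError: a Tensor with 2 elements
                                       # cannot be converted to Scalar
```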

I haven't run into this before; you could try not passing subject. If everything worked fine previously, look for the cause in your input data.

Set some breakpoints and take a look; I'm not going to help debug this on my end.

Could it be that, in the datasets the WISE code handles, each edit_prompt can only map to a single locality_prompts_Relation_Specificity / locality_prompts_Forgetfulness / ..?

Yes, currently only single-sample (bs=1) inference is supported; bs>1 is not supported yet.
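Until bs>1 is supported, one possible workaround is to pre-split the data so each edit request carries only one locality prompt per key. A hypothetical sketch, assuming locality_inputs follows the {'<key>': {'prompt': [...], 'ground_truth': [...]}} layout used by run_knowedit_llama2.py, where an entry for a single record may itself be a list of prompts:

```python
# Hypothetical preprocessing sketch: keep only the first locality prompt per
# record so every evaluation forward pass sees batch size 1. The field layout
# below is an assumption based on run_knowedit_llama2.py, not guaranteed.
def keep_first_locality_prompt(locality_inputs):
    for payload in locality_inputs.values():
        payload['prompt'] = [p[0] if isinstance(p, list) else p
                             for p in payload['prompt']]
        payload['ground_truth'] = [g[0] if isinstance(g, list) else g
                                   for g in payload['ground_truth']]
    return locality_inputs
```

Calling this on locality_inputs before editor.edit(...) should avoid the multi-element min_dist, at the cost of evaluating only one locality prompt per record.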

Great, thanks for the reply~
One more question~
With vanilla_generation=True, WISE's performance drops sharply: on ZsRE, reliability falls from 75.55 to 27.15 and generalizability from 71.85 to 23.64, while locality rises from 35.33 to 91.01 and portability drops from 53.91 to 8.48. Is this result reasonable?

That seems plausible. WISE is effective at correcting the token distribution, but under vanilla_generation (free-running token generation) it can run into the n-gram overlap problem.

Unfortunately, in my latest experiments WISE still reaches nearly 100% reliability and generalization even with vanilla_generation (single edits). I cannot reproduce the problem you are seeing....
[screenshot: evaluation results]

Hi, the results above were obtained when editing zsre_test.json with sequential_edit=True.

Running the full dataset takes quite a while; testing sequential editing on 10 cases, I observe Rel and Gen of 0.84 and 0.74. That is indeed lower than token-by-token accuracy, but nowhere near a collapse to 0.3 or even 0.2.

Metrics Summary:  {'post': {'rewrite_acc': 0.8400000000000001, 'rephrase_acc': 0.74, 'locality': {'neighborhood_acc': 1.0}, 'portability': {'one_hop_acc': 0.45}}}

My second point: the other methods you tested are also evaluated with token-by-token accuracy. Vanilla generation is not the sole proof that knowledge has been mastered; anything that can steer the knowledge distribution and generalize counts as model editing. Also, have you tested how the other methods perform under vanilla generation?
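For readers following the discussion, here is an illustrative sketch of the two protocols being compared: teacher-forced token-by-token accuracy versus vanilla free-running generation. This is not EasyEdit's actual evaluation code; the function names and the max_new_tokens value are assumptions.

```python
import torch

def token_by_token_acc(model, tok, prompt, target, device):
    """Teacher forcing: feed prompt+target and score each target token
    against the model's argmax prediction at that position."""
    full = tok(prompt + target, return_tensors='pt').to(device)
    n_prompt = len(tok(prompt)['input_ids'])
    with torch.no_grad():
        logits = model(**full).logits
    # Logits at position i predict the token at position i + 1.
    preds = logits.argmax(dim=-1)[0, n_prompt - 1:-1]
    labels = full['input_ids'][0, n_prompt:]
    return (preds == labels).float().mean().item()

def vanilla_generation_acc(model, tok, prompt, target, device):
    """Free-running decoding: generate from the prompt alone and check
    whether the decoded continuation matches the target."""
    inputs = tok(prompt, return_tensors='pt').to(device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    continuation = tok.decode(out[0, inputs['input_ids'].shape[1]:],
                              skip_special_tokens=True)
    return float(continuation.strip().startswith(target.strip()))
```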

Anyway, you can note this weakness of WISE in your paper/write-up; I won't be replying further.