zjunlp/EasyEdit

How to evaluate commonsense locality?

Closed this issue · 5 comments

Hi,

Thanks for maintaining the repo!

After reading through the code and your paper, Editing Large Language Models: Problems, Methods, and Opportunities, I am not sure how to reproduce the locality results shown in Table 4 of the paper. The dataset looks like a "locality" set, but I didn't find an example of using it properly. Can you share a minimal example?


Hi there, you can find it in Appendix B.3.3. For the computation, we combine the question and each choice as the input, compute the loss for each choice, and select the one with the minimum loss as the answer.
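A minimal sketch of that multiple-choice scoring, assuming a Hugging Face causal LM and the standard cross-entropy loss as the per-choice score; the model name, question, and choices below are placeholders, not the repo's actual evaluation code:

```python
# Score each (question, choice) pair by language-modeling loss
# and pick the choice with the lowest loss.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; use the edited model in practice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def choice_loss(question: str, choice: str) -> float:
    # Concatenate question and choice as one input; the model's built-in
    # loss is the mean cross-entropy over the tokens (a PPL-style score).
    enc = tokenizer(question + " " + choice, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

question = "A fox is known for being"          # hypothetical example
choices = ["cunning", "loyal", "slow", "shy"]  # hypothetical choices
losses = [choice_loss(question, c) for c in choices]
print(choices[losses.index(min(losses))])
```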

zxlzr commented

Hi buddy, do you have any further questions?

Thanks for your answer. By loss, do you mean PPL? BTW, were distracting neighbor and other attribution computed with the same logic?

Yes, PPL.
However, distracting neighbor and other attribution are computed with the token-level exact-match metric, which can be calculated by our evaluation code.
This means you can directly use our code to get the results for distracting neighbor and other attribution, but you need to evaluate the reasoning task on your own.
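For reference, a rough sketch of a token-level exact-match score under teacher forcing; the function names and slicing here are my own, not EasyEdit's actual implementation, and for locality the reference sequence is typically the pre-edit model's own predictions on the same neighbor prompt:

```python
# Hypothetical helpers illustrating token-level exact match: take the model's
# argmax prediction at each target position and report the fraction that
# matches a reference token sequence (here, the pre-edit model's predictions).
import torch

def target_argmax_tokens(model, tokenizer, prompt: str, target: str) -> torch.Tensor:
    prompt_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    target_ids = tokenizer(target, add_special_tokens=False, return_tensors="pt")["input_ids"]
    input_ids = torch.cat([prompt_ids, target_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Predictions for the target span come from the positions just before each target token.
    return logits[:, prompt_ids.shape[-1] - 1 : -1, :].argmax(dim=-1)

def token_exact_match(pre_edit_model, post_edit_model, tokenizer, prompt: str, target: str) -> float:
    ref = target_argmax_tokens(pre_edit_model, tokenizer, prompt, target)
    pred = target_argmax_tokens(post_edit_model, tokenizer, prompt, target)
    return (pred == ref).float().mean().item()
```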

zxlzr commented

Hi buddy, do you have any further questions?