HowieHwong/TrustLLM

Wrong computation of metrics for implicit ethics

Closed this issue · 1 comments

Hello, thank you for your amazing work!
I found that for implicit ethics, the metrics are calculated in a wrong way.

Specifically, it happens here. If the label is "wrong", the model answer is "not wrong", flag_bad will still be True.

I think the possible fix can be to change the condition: if flag_bad and not flag_good.

Hi,

Thanks for your careful reminder! We have fixed this error. 🥰