Dataset for Similarity and Category and Results for method and step level inference?

Question

Closed this issue 3 years ago · 4 comments

Hi, thank you for your work! I want to ask about two things:

1. Would it be possible to release the similarity and category test sets? I only see the random strategy in the GDrive.
2. Could you report the exact method-level and step-level accuracies that you briefly mention in Figure 3? Having the results for the best-performing models (the triangles) and for human performance (the circles) would be great!

Thank you in advance.

Answer 1 · 2022-01-20T21:15:41.000Z

Hi,

Here are the accuracies. In each list, the order is random, similarity, category.
model: method: 0.6972, 0.7431, 0.5316; step: 0.7848, 0.7465, 0.6607
human: method: 0.905, 0.727, 0.74; step: 0.92, 0.8920, 0.86
Let me know if you have more questions. Thank you!
Yue
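
For anyone who wants to work with these numbers, here is a minimal Python sketch that pairs each reported accuracy with its sampling strategy, using the order given in the reply (random, similarity, category), and computes the human-minus-model difference. The names `STRATEGIES`, `REPORTED`, `by_strategy`, and `human_model_gap` are illustrative only and do not come from the repository; the values are copied from the reply, not recomputed.

```python
# Strategy order as stated in the reply: random, similarity, category.
STRATEGIES = ("random", "similarity", "category")

# Accuracies copied verbatim from the reply (hypothetical container, not repo code).
REPORTED = {
    "model": {
        "method": (0.6972, 0.7431, 0.5316),
        "step": (0.7848, 0.7465, 0.6607),
    },
    "human": {
        "method": (0.905, 0.727, 0.74),
        "step": (0.92, 0.8920, 0.86),
    },
}


def by_strategy(values):
    """Pair each accuracy with its sampling strategy, in the stated order."""
    return dict(zip(STRATEGIES, values))


def human_model_gap(level):
    """Human accuracy minus model accuracy for one level ('method' or 'step')."""
    human = by_strategy(REPORTED["human"][level])
    model = by_strategy(REPORTED["model"][level])
    return {s: round(human[s] - model[s], 4) for s in STRATEGIES}


if __name__ == "__main__":
    for level in ("method", "step"):
        print(level, human_model_gap(level))
```

A negative value in the output means the reported model accuracy is higher than the reported human accuracy for that strategy.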

Answer 2 · 2022-01-21T03:18:29.000Z

Thank you for your response! Just one more thing that I forgot to ask. What is the exact accuracy for the goal (on random, similarity, category)?

Answer 3 · 2022-01-23T17:40:05.000Z

Please see the results in Table 2 of the paper.

Answer 4 · 2022-01-23T19:40:16.000Z

Oh, I missed that. Thank you!