Dataset for Similarity and Category and Results for method and step level inference?
Closed this issue · 4 comments
zerocstaker commented
Hi, thank you for your work! I would like to ask about two things:
- Would it be possible to release the similarity and category test sets? I only see the random strategy on the Google Drive.
- Could you report the exact accuracies for the method level and step level that you briefly mentioned in Figure 3? Having the results for the best-performing models (the triangles) and human performance (the circles) would be great!
Thank you in advance.
YueYANG1996 commented
Hi,
- I just uploaded the test set to Google Drive.
- Model: method: 0.6972, 0.7431, 0.5316; step: 0.7848, 0.7465, 0.6607
- Human: method: 0.905, 0.727, 0.74; step: 0.92, 0.8920, 0.86
In each list, the order is random, similarity, category.
Let me know if you have more questions. Thank you!
Yue
zerocstaker commented
Thank you for your response! Just one more thing that I forgot to ask: what is the exact accuracy at the goal level (on random, similarity, and category)?
YueYANG1996 commented
Please see the results in Table 2 of the paper.
zerocstaker commented
Oh, I missed that. Thank you!