JosephKJ/OWOD

Why is there a difference in the reproduced results?

Closed this issue · 3 comments

Apart from the number of GPUs used, my configuration is the same as the author's. For the known classes, I got results similar to the author's. On the A-OSE metric, however, my results are quite different from those in the paper. Below are my reproduced results for the four tasks. I hope the author can answer my question. Thanks!
image

Please note that others have independently been able to reproduce the results (link). WI is a bit noisy, but the other metrics should match up. Have you tried scaling the hyperparameters according to the discussion here?
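The linked discussion is not reproduced in this thread. As a rough sketch of what scaling hyperparameters to a different GPU count usually means, the snippet below applies the linear scaling rule: when the total batch size changes by a factor k, the learning rate is multiplied by k and the iteration counts are divided by k. The `Schedule` class, the `scale_schedule` helper, and all reference values are hypothetical illustrations, not the repository's actual configuration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Schedule:
    """Hypothetical container for the solver settings that depend on batch size."""
    ims_per_batch: int      # total images per iteration across all GPUs
    base_lr: float          # learning rate used with that batch size
    max_iter: int           # total number of training iterations
    steps: Tuple[int, ...]  # iterations at which the learning rate decays


def scale_schedule(ref: Schedule, new_ims_per_batch: int) -> Schedule:
    """Apply the linear scaling rule to a reference schedule."""
    factor = new_ims_per_batch / ref.ims_per_batch
    return Schedule(
        ims_per_batch=new_ims_per_batch,
        base_lr=ref.base_lr * factor,               # smaller batch, smaller LR
        max_iter=int(round(ref.max_iter / factor)),  # smaller batch, more iterations
        steps=tuple(int(round(s / factor)) for s in ref.steps),
    )


# Assumed reference values for illustration only (not taken from the OWOD configs).
ref = Schedule(ims_per_batch=8, base_lr=0.01, max_iter=90000, steps=(60000, 80000))

# Example: training with half the total batch size, e.g. on half as many GPUs.
scaled = scale_schedule(ref, new_ims_per_batch=4)
print(scaled)
```

The point of scaling all three quantities together is that the model sees the same number of images at each learning-rate stage, regardless of how many GPUs are used.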

Hello @JosephKJ @yuandhu, I was able to reproduce the results successfully; the table is attached below.

|  | Task 1 | Task 2 | Task 3 | Task 4 |
|---|---|---|---|---|
| ResNet-50 ORE (paper) | 56.34 8234 0.02193 | 52.37 25.58 38.98 7772 0.0154 | 37.77 12.41 29.32 6634 0.0081 | 30.01 13.44 26.66 |
| ORE (retrained) | 55.86 6390 0.04867 | 52.36 25.04 38.71 7635 0.0387 | 37.52 12.28 29.11 0.11 6151 0.0275 | 29.63 12.82 25.43 |
| Recall | 71.92 | 71.74 46.53 | 57.12 33.69 49.31 9.11 | 51.99 36.17 48.03 |
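To put numbers on how close the retrained run is, the sketch below computes the relative change in A-OSE and the ratio of WI between the two ORE rows of the table above. It assumes that, for Tasks 1 to 3, the integer in each row is A-OSE and the small decimal next to it is WI; that column assignment is an assumption, since the table carries no headers.

```python
# (A-OSE, WI) per task, copied from the two ORE rows of the table above.
# Assumption: the integer is A-OSE and the adjacent small decimal is WI.
paper = {
    "Task 1": (8234, 0.02193),
    "Task 2": (7772, 0.0154),
    "Task 3": (6634, 0.0081),
}
retrained = {
    "Task 1": (6390, 0.04867),
    "Task 2": (7635, 0.0387),
    "Task 3": (6151, 0.0275),
}


def compare(paper_rows, retrained_rows):
    """Return the relative A-OSE change and the WI ratio for each task."""
    result = {}
    for task, (p_aose, p_wi) in paper_rows.items():
        r_aose, r_wi = retrained_rows[task]
        result[task] = {
            "aose_rel_change": (r_aose - p_aose) / p_aose,  # negative = lower than paper
            "wi_ratio": r_wi / p_wi,                        # 1.0 = identical to paper
        }
    return result


comparison = compare(paper, retrained)
for task, row in comparison.items():
    print(f"{task}: A-OSE {row['aose_rel_change']:+.1%}, WI x{row['wi_ratio']:.2f}")
```

Under that assumption, A-OSE differs most on Task 1, while WI in the retrained run is more than twice the paper's value on every task, which is consistent with the earlier remark that WI is noisy.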

thank you