google/gematria

Write comparison script


It would be good to validate that the benchmarking numbers we're getting match previous results (like BHive and uica-eval) to ensure that we aren't doing anything egregiously wrong. To do this we need to do a couple of things:

  • Write a script (probably Python) that can compare CSVs in the BHive format and identify (major) discrepancies.
  • Do a benchmarking run using our tooling against one of these datasets.
  • Run the comparison script and observe the results.
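
As a starting point, the comparison step could look something like the sketch below. It assumes the BHive format is a headerless CSV where each row is `<hex-encoded basic block>,<throughput>`, and that a "major" discrepancy means a relative error above some threshold. The function names (`load_throughputs`, `find_discrepancies`) and the 10% default are placeholders for illustration, not anything that exists in the repo today.

```python
import csv


def load_throughputs(path):
    """Read a headerless CSV of (hex block, throughput) rows into a dict.

    Assumption: column 0 is the hex-encoded block, column 1 the throughput.
    """
    throughputs = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            try:
                throughputs[row[0].strip().lower()] = float(row[1])
            except ValueError:
                continue  # skip rows without a numeric throughput
    return throughputs


def find_discrepancies(ours, reference, rel_threshold=0.1):
    """Return (block, ours, reference, rel_error) tuples for blocks present
    in both datasets whose relative error exceeds rel_threshold.

    The 10% default threshold is a hypothetical choice, not a project setting.
    """
    discrepancies = []
    for block in sorted(ours.keys() & reference.keys()):
        ref = reference[block]
        if ref == 0:
            continue  # relative error is undefined for a zero reference
        rel_error = abs(ours[block] - ref) / abs(ref)
        if rel_error > rel_threshold:
            discrepancies.append((block, ours[block], ref, rel_error))
    return discrepancies
```

Usage would be roughly `find_discrepancies(load_throughputs("ours.csv"), load_throughputs("bhive.csv"))`, with both file names being placeholders. Blocks that appear in only one file are ignored here; reporting them separately might also be worth doing, since missing blocks could point at a problem of their own.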