google/gematria

Parallelize benchmarking


Given the scale of our datasets (potentially 10^8 basic blocks), we will need a reasonably fast way to benchmark them. Parallelizing the benchmarking is an obvious first step. This needs a couple of things implemented on the LLVM side:

  • Shared memory names (used for memory annotations) need to be derived from the thread ID as well as the process ID, so that threads within one process do not collide.
  • There needs to be an option to pin a benchmarking process to a specific core within llvm-exegesis.

(There might be more to do on the llvm-exegesis side.)
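
To illustrate the two requirements above, here is a minimal Python sketch. It is not llvm-exegesis code: the name format and both helper functions (`shared_memory_name`, `pin_current_process_to_core`) are hypothetical, and the pinning relies on `os.sched_setaffinity`, which is only available on Linux.

```python
import os


def shared_memory_name(prefix: str, pid: int, tid: int, index: int) -> str:
    """Build a shared memory name that is unique per process and per thread.

    The format is an assumption made for illustration; the point is only that
    including the thread ID keeps two threads of one process from colliding.
    """
    return f"/{prefix}-{pid}-{tid}-{index}"


def pin_current_process_to_core(core: int) -> bool:
    """Restrict the calling process to one core. Returns True on success.

    Returns False if the platform lacks sched_setaffinity or if the requested
    core is not in the set of cores the process is allowed to run on.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core not in os.sched_getaffinity(0):
        return False
    os.sched_setaffinity(0, {core})
    return True
```

With a PID-only name, two threads in the same process would both map the same region; adding the thread ID gives each one its own.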

Then, we need to do the following:

  • Implement parallel benchmarking using LLVM threading primitives.
  • Validate that running on multiple threads doesn't impact results (using validation counters).
  • Ship it.
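
The plan above can be sketched roughly as follows. This is a simplified Python illustration, not the intended implementation: it uses a thread pool from the standard library instead of LLVM threading primitives, `fake_benchmark` is a hypothetical stand-in for a real measurement, and the validation step compares serial against parallel results directly rather than reading validation counters.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def fake_benchmark(block_hex: str) -> float:
    """Hypothetical stand-in for a measurement: the block's length in bytes."""
    return len(block_hex) / 2.0


def benchmark_parallel(
    blocks: Sequence[str],
    benchmark: Callable[[str], float],
    num_workers: int,
) -> List[float]:
    """Run `benchmark` over all blocks on a pool of worker threads.

    pool.map preserves input order, so results line up with `blocks`.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(benchmark, blocks))


def max_relative_deviation(
    serial: Sequence[float], parallel: Sequence[float]
) -> float:
    """Largest relative difference between serial and parallel measurements."""
    worst = 0.0
    for s, p in zip(serial, parallel):
        denom = max(abs(s), 1e-12)  # avoid division by zero
        worst = max(worst, abs(s - p) / denom)
    return worst
```

A validation run would benchmark the same blocks once serially and once in parallel, then check that `max_relative_deviation` stays below a chosen tolerance before trusting the parallel numbers.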

Doing this outside of the process with Ray seems to work much better.