Profile-Guided Optimization (PGO) evaluation results

Question

zamazan4ik opened this issue a year ago · 2 comments

Hi!

Recently I ran a lot of benchmarks to measure the effects of Profile-Guided Optimization (PGO) on different projects (including some libraries) - the results are available here. So I decided to test PGO with Tantivy as well.

My test setup is a MacBook M1 Pro, macOS 13.4 Ventura. All tests were run on the same hardware. Rust version: 1.72. The PGO builds were made with cargo-pgo. I used the Tantivy benchmarks as both the training set and the evaluation set. The background load was kept the same (as much as I can guarantee on macOS, of course). The results are the following (in the cargo bench output format):

Release: https://pastebin.com/dFk03qrT
PGO optimized compared to Release: https://pastebin.com/VGfNCee0
PGO Instrumented compared to Release (so you can evaluate how much slower Tantivy is in instrumentation mode): https://pastebin.com/a006aHVZ

This information can be helpful:

For the Tantivy users who want to optimize their applications
For benchmark purposes as an additional way to extract more performance

It would probably be a good idea to mention PGO somewhere in the Tantivy documentation/README/Wiki.

Answer 1 · 2023-08-30T11:28:35.000Z

Thanks for these investigations, the results are quite interesting.
It seems you ran the benchmarks with cargo bench. The full suite is behind a feature flag:
cargo +nightly bench --features unstable

The numbers suggest that the default compilation has a lot of leeway.
I'd like to annotate the Rust code to replicate the resulting binary, but as far as I know there's no interface to LLVM for that, except inline.

Answer 2 · 2023-08-30T11:35:27.000Z

> It seems you ran the benchmarks with cargo bench. The full suite is behind a feature flag:
> cargo +nightly bench --features unstable

Didn't know that! Would be a good idea to test PGO on the full benchmark set.

> I'd like to annotate the Rust code to replicate the resulting binary, but as far as I know there's no interface to LLVM for that, except inline.

Well, that can be quite difficult to maintain, to be honest. That's why PGO shines here: all the "annotation" machinery is handled by the compiler, not by a human. As you mentioned above, not all optimizations done by PGO can be replicated in the source code via LLVM attributes. Manual annotation does have one benefit, however: it does not require compiling twice (an instrumented build followed by an optimized one).
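
To make the trade-off a bit more concrete, here is a minimal sketch of the kind of manual hints stable Rust offers: `#[inline]`, `#[inline(never)]` and `#[cold]`. The function names and the delta-decoding logic are hypothetical and not taken from Tantivy. The attributes are only hints to the compiler, and nothing here is claimed to reproduce what PGO did in the benchmarks above.

```rust
// Hypothetical hot-path helper (not Tantivy code).
// `#[inline]` hints that this small function should be inlined at call sites.
#[inline]
fn apply_delta(prev: u32, delta: u32) -> u32 {
    match prev.checked_add(delta) {
        Some(next) => next,
        // Rare path: handled out of line so the hot path stays small.
        None => overflow_fallback(prev, delta),
    }
}

// Hypothetical rarely-taken path.
// `#[cold]` hints that calls to this function are unlikely;
// `#[inline(never)]` hints that it should not be inlined into callers.
#[cold]
#[inline(never)]
fn overflow_fallback(prev: u32, delta: u32) -> u32 {
    eprintln!("delta overflow: {prev} + {delta}, saturating");
    u32::MAX
}

// Turns a list of deltas into absolute values using the helpers above.
fn decode_all(deltas: &[u32]) -> Vec<u32> {
    let mut prev = 0u32;
    deltas
        .iter()
        .map(|&delta| {
            prev = apply_delta(prev, delta);
            prev
        })
        .collect()
}

fn main() {
    let decoded = decode_all(&[3, 4, 10]);
    println!("{decoded:?}"); // prints [3, 7, 17]
}
```

Hints like these have to be placed and kept up to date by hand as the hot paths change, which is the maintenance cost mentioned above. With PGO the equivalent decisions come from the recorded profile.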