tensorflow/tensorboard

Enable Link-Time Optimization (LTO) and evaluate other optimizations for TensorBoard

zamazan4ik opened this issue · 1 comments

Hi!

Recently I read the design document, which covers some design choices and performance-related questions. I want to suggest several optimization techniques that could improve the project's performance further.

I noticed that Link-Time Optimization (LTO) is not enabled in the Cargo.toml file for the Rust part. I suggest enabling it, since it usually reduces the binary size (always a good thing to have) and will likely improve the application's performance.

I suggest enabling LTO only for Release builds so that the developer experience is not affected, since LTO adds to the compilation time. If you would rather leave the regular Release build unchanged as well, I suggest adding a separate release-lto profile that applies LTO on top of the regular release optimizations. That would make life easier for maintainers and anyone else who wants to build the most performant version of the application.
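For illustration, the release-lto profile suggested above could look roughly like the fragment below. This is a minimal sketch, not the project's actual configuration: it assumes the Rust part is built directly with Cargo 1.57 or newer (the first release with stable custom profiles), and the choice of `lto = "fat"` is only an assumption that should be benchmarked against `"thin"`.

```toml
# Hypothetical addition to the existing Cargo.toml of the Rust part.
# The regular [profile.release] section is left untouched.

[profile.release-lto]
# Start from the regular release settings (opt-level = 3, no debug assertions, ...).
inherits = "release"
# Whole-program ("fat") LTO across all crates in the dependency graph.
# "thin" is a cheaper alternative that usually links faster.
lto = "fat"
```

Such a profile would be selected with `cargo build --profile release-lto`, and Cargo places the output in `target/release-lto/`. If the binary is produced through another build system instead of plain Cargo, the equivalent setting would have to be passed there.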

After applying LTO, I suggest evaluating further optimization options such as Profile-Guided Optimization (PGO) (more materials and benchmarks are available in my repo: https://github.com/zamazan4ik/awesome-pgo) and Post-Link Optimization (PLO) with tools like LLVM BOLT.

Thank you.

Please feel free to open a PR; it would be interesting to analyze the impact.