codegen-units=1 + LTO causes 3-5% performance regression for sequential code
Code
I tried this code:
The bug trigger:
Line 374: BigUint variable
↓
Line 377: Loop (1M+ iterations)
↓
Line 383: Arithmetic operation
↓
With codegen-units=1 + LTO
↓
LLVM over-inlines Line 383
↓
Register pressure (16 GPRs on x86_64)
↓
Stack spilling
↓
-5% performance regression
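The pattern described in the chain above can be approximated with a small standalone program for comparing builds. This is only a sketch under stated assumptions: the original code is not shown in this report, so `Limbs`, `add_assign`, and `run` are hypothetical stand-ins for the `BigUint` variable and the arithmetic inside the loop, using only the standard library. The build flags in the comment are one plausible way to reproduce the configuration; the report does not say whether fat or thin LTO was used.

```rust
// Hypothetical reproduction sketch, not the code from the report.
// One possible way to compare builds (the LTO mode is an assumption):
//   rustc -C opt-level=3 main.rs
//   rustc -C opt-level=3 -C codegen-units=1 -C lto=fat main.rs
use std::hint::black_box;
use std::time::Instant;

/// Stand-in for a big-integer type: little-endian 64-bit limbs.
#[derive(Clone, Debug, PartialEq)]
struct Limbs(Vec<u64>);

impl Limbs {
    /// Adds `rhs` into `self`, propagating the carry across limbs.
    fn add_assign(&mut self, rhs: &Limbs) {
        let n = self.0.len().max(rhs.0.len());
        self.0.resize(n, 0);
        let mut carry = false;
        for i in 0..n {
            let r = rhs.0.get(i).copied().unwrap_or(0);
            let (s1, c1) = self.0[i].overflowing_add(r);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            self.0[i] = s2;
            carry = c1 || c2;
        }
        if carry {
            self.0.push(1);
        }
    }
}

/// Sequential loop with one arithmetic operation per iteration.
fn run(iterations: u64) -> Limbs {
    // Start just below the limb boundary so the carry path is exercised.
    let mut value = Limbs(vec![u64::MAX - 10]);
    let step = Limbs(vec![1]);
    for _ in 0..iterations {
        // black_box keeps the optimizer from folding the loop away.
        value.add_assign(black_box(&step));
    }
    value
}

fn main() {
    let start = Instant::now();
    let result = run(1_000_000);
    let elapsed = start.elapsed();
    println!("limbs = {:?}, elapsed = {:?}", result.0, elapsed);
}
```

Whether this sketch shows the same slowdown is not guaranteed. It only gives a fixed workload that can be timed under both sets of flags and inspected in the generated assembly for spills.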
I expected to see this happen: "may improve performance"
Instead, this happened: 3-5% slower
- seq_integers: -5.06% (26.1ms → 27.5ms)
- seq_with_step: -4.98% (13.3ms → 14.0ms)
- expand_custom_tabstops: -2.73% (36.6ms → 37.6ms)
- cut_fields_custom_delim: +32.29% (40.7ms → 30.8ms)
- cut_fields_tab: +26.13% (34.1ms → 27.0ms)
- Overall: -10.02% (22 improvements, 10 regressions)
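The sign convention in the per-benchmark figures above is not stated. Within the rounding of the millisecond values, they appear consistent with a relative speed change of `before / after - 1`, where a negative value means the new build is slower. The helper name `speed_change_percent` is hypothetical, and the formula is an inference from the listed timings rather than something the report confirms.

```rust
/// Relative change in speed, in percent.
/// Positive: the run got faster. Negative: it got slower.
/// Assumed formula, inferred from the timings listed in the report.
fn speed_change_percent(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms / after_ms - 1.0) * 100.0
}

fn main() {
    // (benchmark name, time before in ms, time after in ms), as reported.
    let rows = [
        ("seq_integers", 26.1, 27.5),
        ("seq_with_step", 13.3, 14.0),
        ("expand_custom_tabstops", 36.6, 37.6),
        ("cut_fields_custom_delim", 40.7, 30.8),
        ("cut_fields_tab", 34.1, 27.0),
    ];
    for (name, before, after) in rows {
        println!("{}: {:+.2}%", name, speed_change_percent(before, after));
    }
}
```

Because the inputs are rounded to 0.1 ms, the recomputed percentages differ slightly from the reported ones.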
Related
Hmm. I will let the rest of T-compiler judge, but it is not clear to me that we can improve this ourselves, as opposed to e.g. LLVM doing so, and the flags seem to be performing according to spec. As you note yourself:
> may improve the performance
"may improve" also means it may not improve, and not improving often means decreasing it.
This is not a regression in the sense that label means. A regression as we track it is a regression against previous compiler behavior, so this would only be something to report as a regression if a previous compiler version did better.
@rustbot label: -regression-untriaged +T-compiler -C-bug +C-optimization +A-LLVM +A-LTO