Reordering in gcc and clang
HDembinski opened this issue · 8 comments
Hi, I saw your lightning talk. I think the reason you didn't see an effect of noexcept in the reordering test for gcc and clang is that these compilers are too clever. They were able to prove that the increment and decrement have no observable effect and removed them, regardless of the throw.
You can see this here in a greatly simplified version:
https://godbolt.org/z/M-yu-e
The increment is only carried out when it has an observable side effect, for example because the value is reported as part of the exception:
https://godbolt.org/z/e52Wq3
Very interesting, thank you Hans (@HDembinski). I'll have a closer look this weekend!
For the record, the first example by Hans Dembinski, https://godbolt.org/z/M-yu-e, says:
    template <bool Throw>
    void maybe_throw() noexcept(!Throw);

    unsigned foo(unsigned value) {
        ++value;
        maybe_throw<true>();
        --value;
        return value;
    }
And yields for gcc 9.3 with -O3:
    foo(unsigned int):
            push r12
            mov r12d, edi
            call void maybe_throw<true>()
            mov eax, r12d
            pop r12
His second example, https://godbolt.org/z/e52Wq3, says:
    template <bool Throw>
    void maybe_throw(unsigned value) noexcept(!Throw) {
        if (Throw) throw value;
    }

    unsigned foo(unsigned value) {
        ++value;
        maybe_throw<true>(value);
        --value;
        return value;
    }
Which yields:
    foo(unsigned int):
            push rbx
            mov ebx, edi
            mov edi, 4
            inc ebx
            call __cxa_allocate_exception
            xor edx, edx
            mov esi, OFFSET FLAT:_ZTIj
            mov DWORD PTR [rax], ebx
            mov rdi, rax
            call __cxa_throw
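A minimal sketch of why the increment survives in the second example: the incremented value escapes through the exception object, so a handler can observe it. The helper names `observe_thrown` and `demo_thrown_value` are hypothetical, introduced here only for illustration; they are not part of the original godbolt example.

```cpp
#include <cassert>

// Same shape as the second example: the argument is thrown when Throw is true.
template <bool Throw>
void maybe_throw(unsigned value) noexcept(!Throw) {
    if (Throw) throw value;
}

// The conditional specifier makes only the non-throwing variant noexcept.
static_assert(noexcept(maybe_throw<false>(0u)), "maybe_throw<false> is noexcept");
static_assert(!noexcept(maybe_throw<true>(0u)), "maybe_throw<true> may throw");

// Hypothetical helper: returns the value that an exception handler sees.
unsigned observe_thrown(unsigned value) {
    try {
        ++value;
        maybe_throw<true>(value);  // throws the incremented value
        --value;                   // never reached, because the call always throws
    } catch (unsigned thrown) {
        return thrown;             // the increment is observable here
    }
    return value;
}

// Hypothetical usage: the handler receives the input plus one.
void demo_thrown_value() {
    assert(observe_thrown(41u) == 42u);
}
```

Because the handler can read the incremented value, the compiler has to materialize it before the throw, which matches the `inc ebx` in the listing above.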
"They were able to prove that the increment and decrement have no observable effect and removed them, regardless of the throw."
Sorry for the delay, @HDembinski. But I don't entirely understand what you mean. Do you think GCC can figure out that func(do_throw_exception) never throws, even when it has an implicit exception specifier (equivalent to noexcept(false))? https://github.com/N-Dekker/noexcept_benchmark/blob/v03/lib/inc_and_dec_test.cpp#L24
You see, the argument do_throw_exception is initialized by a volatile variable (whose value is false), so the compiler should not be able to optimize away this initialization.
https://github.com/N-Dekker/noexcept_benchmark/blob/v03/lib/inc_and_dec_test.cpp
What I meant is, if you change my code to:
    template <bool Throw>
    void maybe_throw(unsigned value) noexcept(!Throw) {
        if (Throw) throw value;
    }

    unsigned foo(unsigned value) {
        ++value;
        maybe_throw<true>(1);
        --value;
        return value;
    }
Then the increment and decrement of value are removed whether you throw or not. I was wondering whether the same happens here:
Thanks for your clarification, Hans. But if func(do_throw_exception) might throw, then if that happened, value would end up being 1, which would affect the runtime behavior at:
So I don't see how the increment and decrement of value could be removed by GCC, unless it assumes that func(do_throw_exception) never throws.
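To make this argument concrete, here is a minimal sketch, assuming func has an implicit exception specification and really does throw. The namespace `sketch`, the helper `value_after_catch`, and its plain `bool` parameter are hypothetical and exist only for illustration; the real test reads the flag from a volatile variable inside a loop.

```cpp
#include <cassert>

namespace sketch {

struct exception {};

// No exception specifier: implicitly noexcept(false), so it may throw.
void func(const bool do_throw_exception) {
    if (do_throw_exception) throw exception{};
}

// Hypothetical helper: one iteration of the loop body inside a try block.
int value_after_catch(const bool do_throw_exception) {
    int value = 0;
    try {
        ++value;
        func(do_throw_exception);
        --value;  // skipped when func throws
    } catch (const exception&) {
        // value is still 1 here if func threw
    }
    return value;
}

// Hypothetical usage: the result depends on whether the throw happened.
void demo_value_after_catch() {
    assert(value_after_catch(false) == 0);
    assert(value_after_catch(true) == 1);
}

}  // namespace sketch
```

The result differs between the two paths, so the compiler cannot treat the function as always returning 0. It can, however, still replace the arithmetic with the two constants, which is a different thing from assuming that func never throws.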
I guess you are right, feel free to close.
@HDembinski Thanks for allowing me to close the issue, but I still don't allow myself to do so! Not yet! I'm still puzzled by the fact that in this case (my "inc_and_dec_test.cpp" test), I don't see a performance effect from GCC, even though I do see that the https://godbolt.org assembly output gets somewhat simpler when adding noexcept in such a case. (Disclaimer: I'm no assembly programmer!)
I'll have a closer look later this week.
Update: I'm now looking at the following simplification of my "inc_and_dec_test.cpp" test:
    struct exception {};

    namespace
    {
        inline void throw_exception_if(const bool do_throw_exception)
        {
            if (do_throw_exception)
            {
                throw exception{};
            }
        }

        void func(const bool do_throw_exception) OPTIONAL_EXCEPTION_SPECIFIER
        {
            throw_exception_if(do_throw_exception);
        }
    }

    int test_inc_and_dec()
    {
        int value = 0;
        volatile bool volatile_false = false;
        try
        {
            for (int i = 0; i < 2147483647; ++i)
            {
                const bool do_throw_exception = volatile_false;
                ++value;
                func(do_throw_exception);
                --value;
            }
        }
        catch (const exception&)
        {
        }
        return value;
    }
https://godbolt.org/z/eXNk_C (26 lines ASM) is for -O3 -DOPTIONAL_EXCEPTION_SPECIFIER=noexcept
https://godbolt.org/z/f9UWX6 (37 lines ASM) is for -O3 -DOPTIONAL_EXCEPTION_SPECIFIER= (implicit exception specifier)
Now I'm still figuring out what exactly is going on. Please let me know if you have any suggestions on how to adjust the test in order to get an observable performance effect from GCC out of adding noexcept!
Looking at the actual assembly of those two cases, I can still see that the compiler is too clever.
It concluded that incrementing and decrementing value inside the loop is useless in either case. It inlined the bodies of func and throw_exception_if, because those are visible to it. Whether the throw path is taken boils down to an if embedded in throw_exception_if. In both cases, this if became embedded in the loop and is the only thing executed in the loop. So in the loop, it just checks a gazillion times whether that if takes the false branch or not. The compiler deduced that the result of the function is zero if the if never triggers. It also deduced that the result of the function is 1 if it triggers and you do not use noexcept on func. If you apply noexcept and the throw triggers, then it simply terminates and never returns.
In summary, there is no difference at runtime because the compiler optimized out the increment and decrement in either case.
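The analysis above can be sketched as the code the optimizer effectively leaves behind. This is a hand-written approximation, not compiler output. The function names, the `iterations` parameter, and passing the flag by reference are assumptions made here so that the sketch can be exercised quickly.

```cpp
#include <cassert>
#include <exception>

// Roughly what remains without noexcept: the loop only polls the flag.
// If the flag is ever true, the exception is thrown and caught, and the
// function returns 1; otherwise it returns 0. No increment or decrement is left.
int reduced_without_noexcept(const volatile bool& flag, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        if (flag) return 1;
    }
    return 0;
}

// Roughly what remains with noexcept: the same polling loop, but an exception
// escaping a noexcept function calls std::terminate, so control never returns.
int reduced_with_noexcept(const volatile bool& flag, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        if (flag) std::terminate();
    }
    return 0;
}

// Hypothetical usage: with a flag that stays false, both variants do the same work.
void demo_reduced_loop() {
    volatile bool never_set = false;
    assert(reduced_without_noexcept(never_set, 1000) == 0);
    assert(reduced_with_noexcept(never_set, 1000) == 0);
}
```

In the benchmark the flag is always false, so both variants spend their time in the same loop of volatile reads, which is consistent with the identical timings.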