lhmouse/asteria

Remove unnecessary null pointer checks

elfring opened this issue · 27 comments

The check is needed to prevent an unnecessary function call with GCC 8 and earlier when the argument is null.
GCC 9+ and Clang don't suffer from this issue.
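
For context, the pattern under discussion looks roughly like this minimal sketch; the function name is illustrative, not the actual repository code:

```cpp
#include <new>

// Hypothetical wrapper (not the actual Asteria code): the null check
// skips the call to the global deallocation function entirely, which
// matters for compilers that reportedly do not elide the call
// themselves when the argument is null.
inline void deallocate_storage(void* bptr)
  {
    if(bptr)
      ::operator delete(bptr);
  }
```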

[Note: This reply is superseded. See this reply at the bottom.]

  • Would a build-configuration check help to include the extra source code only on demand?
  • Could the use of smart pointers help at the affected places?
  1. No, don't suggest configuration checks. They mess up the code and bring nothing.
  2. Yes, but I don't use smart pointers for LLDS (which stands for Low-Level Data Structure). Adopting unique_ptr requires a custom deleter, which messes things up too, and the deleter could only deallocate memory without destroying buckets (they will already have been destroyed at that point; note that every bucket is destroyed as soon as it has been move-constructed), which introduces nothing but inconsistency. (See the sketch after this list.)
  • Specific configuration checks are usually supported by the selected higher-level build system.
  • Can any of the statements in the affected function implementations throw a C++ exception before calling the delete operator at the end?
  1. We don't depend on configuration macros other than PACKAGE_NAME etc., which you can define yourself if you like.
  2. No. Reference and Value rely heavily on reference counting, so their copy and move constructors throw no exceptions, just like those of std::runtime_error and std::locale.
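
For illustration, the unique_ptr variant mentioned in item 2 might look like this hypothetical sketch:

```cpp
#include <memory>
#include <new>

// Hypothetical sketch of what adopting unique_ptr for LLDS storage
// would entail. The custom deleter may only free the raw memory; it
// must not run bucket destructors, because every bucket is destroyed
// as soon as it has been move-constructed during a rehash, so the
// buckets no longer exist when the deleter runs.
struct Bucket { /* reference-counted contents elided */ };

struct storage_deleter
  {
    void operator()(Bucket* bptr) const noexcept
      { ::operator delete(static_cast<void*>(bptr)); }
  };

using storage_ptr = std::unique_ptr<Bucket[], storage_deleter>;
```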

The check is needed to prevent an unnecessary function call with GCC 8 and earlier when the argument is null.
GCC 9+ and Clang don't suffer from this issue.

This is arguably a style bug, because it is confusing: a deallocation function can never have such a narrow contract by design. The verbosity pollutes the client code with irrelevant implementation details, and special-casing particular compiler versions also leaks implementation details of the C++ toolchain. If the artificial check is really needed, it should be implemented once, elsewhere, as a forced-inline helper function; no deallocation function should then be called directly.
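
For instance, such a helper might look like the following sketch; the macro and function names are made up for illustration:

```cpp
#include <new>

// Hypothetical forced-inline helper: the compiler-specific null check
// lives in exactly one place, and client code never calls the
// deallocation function directly.
#if defined(__GNUC__) || defined(__clang__)
#  define DEALLOC_FORCE_INLINE  __attribute__((__always_inline__)) inline
#else
#  define DEALLOC_FORCE_INLINE  inline
#endif

DEALLOC_FORCE_INLINE
void checked_deallocate(void* bptr) noexcept
  {
    if(bptr)
      ::operator delete(bptr);
  }
```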

The check is needed to prevent an unnecessary function call with GCC 8 and earlier when the argument is null.
GCC 9+ and Clang don't suffer from this issue.

This is arguably a style bug, because it is confusing: a deallocation function can never have such a narrow contract by design. The verbosity pollutes the client code with irrelevant implementation details, and special-casing particular compiler versions also leaks implementation details of the C++ toolchain. If the artificial check is really needed, it should be implemented once, elsewhere, as a forced-inline helper function; no deallocation function should then be called directly.

You guys are always resolving issues by adding another level of indirection, for what? There is no need to invent such wrappers. The check eliminates a no-effect, superfluous and unnecessary call. The check does the right thing. The check does not degrade with newer compilers. So it is the simplest solution. There will be no change, period.

You call destructors on objects whose (possibly delegated) constructors have finished normally; you don't destroy objects that don't exist. Likewise, you don't deallocate memory blocks that haven't been allocated at all. Why should null pointers be passed to ::operator delete()? And why would you expect it to have no effect, just because you don't wanna do the check yourself? Yes yes the C++ standard says so, and the only reason why the C++ standard says so is that C++ is a stupid language designed by stupid people and as a consequence cannot be made non-stupid.

You guys are always resolving issues by adding another level of indirection, for what? There is no need to invent such wrappers. The check eliminates a no-effect, superfluous and unnecessary call. The check does the right thing. The check does not degrade with newer compilers. So it is the simplest solution. There will be no change, period.

Because the route of relying on implementation details to make the result well-behaved already imposes a new level of indirection on readers of the code. By nature, that layer is far thicker than any new indirection introduced by foreseeable changes to the code.

The code-generation behavior in question is QoI-related, undocumented, and likely unstable, since it goes against the semantic rules of the language from the very beginning. Sure, we don't rely on abstract-machine semantics that often; but you, as a C++ user, are assumed to work that way by default, by compiler authors and language designers alike. That is also what the FAQ suggests.

And there can be no exhaustive tests to ensure that every supported build configuration is as sane as you expect. Note that even if the number of supported compiler versions is limited, you don't specify the target triplets.

Further, no one can effectively guarantee that bypassing the call is absolutely (or even actually) better, due to icache pressure, branch-predictor state, etc. That is, without benchmark results, the expected implementation behavior is itself suspected of being unnecessary.

So, until you document such facts explicitly, this is merely a hack all around. The current way is a thoroughly inferior solution for keeping things understandable. It is never simple.

You call destructors on objects whose (possibly delegated) constructors have finished normally; you don't destroy objects that don't exist. Likewise, you don't deallocate memory blocks that haven't been allocated at all. Why should null pointers be passed to ::operator delete()? And why would you expect it to have no effect, just because you don't wanna do the check yourself? Yes yes the C++ standard says so, and the only reason why the C++ standard says so is that C++ is a stupid language designed by stupid people and as a consequence cannot be made non-stupid.

I'm not arguing about the stupidity of the API you are using. That is a minor but orthogonal problem compared to the one exposed by your arguments here. I never rely on the unnecessarily wide contract myself, but there are cases where the stupidity cannot be avoided, e.g. forwarding the call to others' code, which legitimately relies on the guarantee. Assuming a documented interface with different preconditions is equally (if not more) stupid; that's why another level of abstraction is needed as a workaround: just ensure the "static" stupidity disappears in all but one place in the code. (OTOH, performance is off topic.)
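
For instance, a generic forwarding wrapper of the kind mentioned might look like this sketch (hypothetical names), which relies on the wide contract in exactly one place:

```cpp
#include <new>

// Hypothetical sketch of the forwarding case: generic cleanup code
// cannot know whether its argument is null, so it legitimately relies
// on the callee's documented wide contract in this one place.
template<typename T>
void destroy_and_free(T* ptr) noexcept
  {
    if(ptr)
      ptr->~T();
    ::operator delete(ptr);  // well-defined even when `ptr` is null
  }
```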

Further, no one can effectively guarantee that bypassing the call is absolutely (or even actually) better, due to icache pressure, branch-predictor state, etc. That is, without benchmark results, the expected implementation behavior is itself suspected of being unnecessary.

So you presume that it might have better performance with the branch inside the operator delete() implementation, rather than outside of it? You would still have an extra call: nothing less, nothing simpler, nothing better.

(OTOH, performance is off topic.)

Well enough, please STFU. An unnecessary function call that has no effect (I am not saying GCC is non-conforming; the call indeed has no effect, but it still has overhead, which I tend to eliminate) slows down the first definition (of variables and named functions) in each lexical scope by about 1% to 2%. It is worth it.

So you presume that it might have better performance with the branch inside the operator delete() implementation, rather than outside of it? You would still have an extra call: nothing less, nothing simpler, nothing better.

Nope. That paragraph just shows the logical flaws in your reasoning. Such a property is unspecified in every respect. I don't assume it would be better or worse; either outcome is consistent with my default expectation, which is also the de facto stance of most C++ users (including the language designers and the compiler authors).

Technically, your reasoning is still false.

  • If the global ::operator delete is inlined and inter-procedural optimizations are effectively performed, it should have the same time efficiency, at least in this local context. But depending on the IR and the code-generation algorithms, the branch outside can put more resource pressure on the optimizer when transforming the code around the call (e.g. in RTL-related transformations before any concrete linear low-level IR is determined), so other code can end up less optimized under the same optimizer configuration.
  • There are suboptimal microarchitectural niches. The trade-off between a branch and a call is highly non-deterministic, particularly in hosted environments where inexact but similar shared hot code can exist.
  • And with current implementations (unless you have redefined a non-conforming ::operator delete), you have 2 branches on the suggested "taken" path, rather than 1, as the sketch below illustrates.
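
A sketch of the last point, under the assumption that the deallocation function forwards to free():

```cpp
#include <cstdlib>

// Sketch assuming a typical implementation where ::operator delete
// forwards to free(), and free() performs its own internal null check.
// On the non-null path the guarded call site then executes two
// branches: the one written here, plus the one inside free().
void guarded_release(void* bptr)
  {
    if(bptr)           // branch #1, at the call site
      std::free(bptr); // stand-in for ::operator delete(bptr),
  }                    // whose implementation contains branch #2
```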

(OTOH, performance is off topic.)

Well enough, please STFU. An unnecessary function call that has no effect (I am not saying GCC is non-conforming; the call indeed has no effect, but it still has overhead, which I tend to eliminate) slows down the first definition (of variables and named functions) in each lexical scope by about 1% to 2%. It is worth it.

The analysis based on C++ semantics and the raw source code already shows the branch test to be unnecessary, and there is no distinction in the number of calls in C++ parlance. Performance is irrelevant, as it is not a static property of the source, nor is a big difference even provable under many practical (and fair) implementation-dependent assumptions, so normally C++ users don't care. For those who hold assumptions beyond these, the burden is on them to prove the call "unnecessary", as claimed. As pointed out, you can't establish this without enumerating all supported build configurations, which are presumably infinite.

Well enough, please STFU. An unnecessary function call that has no effect (I am not saying GCC is non-conforming; the call indeed has no effect, but it still has overhead, which I tend to eliminate) slows down the first definition (of variables and named functions) in each lexical scope by about 1% to 2%. It is worth it.

If you believe "about 1% to 2%" holds in general, you should have the benchmark results. Document them; refer to the reasoning and discussion here. Alternatively, at least add a note (as a comment) in the source pointing out that this path is intentionally hot and that the (C++ source-level) redundant test must be kept unconditionally. (I still regard the latter as inferior, because it likely requires many duplicated comments everywhere, which is a disaster for anyone but probably you.)

It is worth it.

🔮 Which statistics do you expect (or observe) for the distribution between passing null pointers and deleting valid objects via a variable like "bold"?

🔮 Which statistics do you expect (or observe) for the distribution between passing null pointers and deleting valid objects via a variable like "bold"?

What? If you ever declare some local variables or functions (including function parameters and the references in a for-each loop), then upon the first rehash it is definitely null. It may be non-null only if there are so many such things that the table has to be reallocated. ATM we allocate 17 buckets at the first call, which is sufficient for 8 references, so this would happen only when you declare more distinct names than that. (A sketch of this code path follows.)
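
A minimal sketch of that code path, with hypothetical identifiers and an assumed bucket size (not the actual Asteria source):

```cpp
#include <cstddef>
#include <new>

// Hypothetical sketch of the rehash path described above: on the very
// first rehash the old storage pointer is null, so the guarded
// deallocation is skipped entirely.
struct name_table
  {
    void* m_bptr = nullptr;   // bucket storage; null until the first rehash
    std::size_t m_nbkt = 0;   // number of buckets

    void do_rehash(std::size_t nbkt)
      {
        void* bptr = ::operator new(nbkt * 64);  // 64 = assumed bucket size
        // ... move-construct each old bucket into the new storage,
        // destroying it immediately after it has been moved ...
        if(this->m_bptr)
          ::operator delete(this->m_bptr);  // skipped on the first rehash
        this->m_bptr = bptr;
        this->m_nbkt = nbkt;
      }
  };
```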

Oops, the original snippet was about the delete operator, so it was horribly mistaken.

The relevant example is here. We are not talking about the operator, but the function ::operator delete(void*), which cannot be inlined. Hence the check is not unnecessary, not at all.

Hence the check is not unnecessary, not at all.

Why do you still seem to disagree with the information from "frequently answered C++ questions"?

I would expect an immediate return from a call to the "delete operator" function when a null pointer is passed.
The corresponding run-time characteristics can then vary according to various factors.

There ain't gonna be such a change.

I would appreciate it if another bit of redundant source code could be omitted.

Please use Pull Requests for this.

The relevant example is here. We are not talking about the operator, but the function ::operator delete(void*), which cannot be inlined. Hence the check is not unnecessary, not at all.

Wrong. There is also no guarantee that the function call is out-of-line in any case.

You need case-by-case analyses depending on the call sites and the implementations you use. It is simply not worth working around at every call site. Even if a workaround were needed, it should be applied globally, and the code here is not in the right place. Again, whether you are talking about the deallocation function or the delete-expression, and whether it is better for performance or not, the current way is neither logically correct nor simple.

The only semantically visible distinction in C++ here is that the delete-expression may have something to do with the new-merging optimization available since C++14. Neither of the implementations supported here clearly guarantees that it can perform such an optimization in such a case, and indeed it cannot here, because the deallocation is not paired with an allocation within a new-expression that is allowed to call the global replaceable allocation function. Hence there is no further difference, in overhead analysis based on the source code, between the operator delete case and the delete-expression case.

Note it is explicitly mentioned that the allocation function can be replaceable. Nevertheless, the optimization is still implementable without the C++14 rules. To implement it, an implementation must (at least) have the ability to detect whether the allocation function has been replaced in practice. This is similar to the case of the global replaceable deallocation function. In other words, such an implementation should usually keep the whole definition of the global replaceable deallocation function known and represented internally in IR form during optimization until it is replaced; namely, it has effectively been inlined. The new rules allow implementations to bypass these limitations, but they still don't forbid inlining.
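
As a concrete illustration of the distinction (a sketch; whether any given compiler performs the elision is a QoI matter):

```cpp
#include <new>

// Illustration of the C++14 rule: when the allocation comes from a
// new-expression, the implementation may omit the paired calls to the
// replaceable global allocation and deallocation functions, so this
// function may be reduced to `return 42;`.
int elidable()
  {
    int* p = new int(42);
    int v = *p;
    delete p;
    return v;
  }

// A direct call to the deallocation function, by contrast, is an
// ordinary function call; the C++14 merging rule does not apply.
void direct(void* bptr)
  {
    ::operator delete(bptr);
  }
```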

No, either you resolve the issue, or you shut up. If you don't propose a solution that does not uglify code then there will be no change. This injunction is final.

This issue is about the already-uglified C++ source code. The obviously simple solution is exactly what the title says.

This issue originally had nothing to do with the generated code. People who want to rely on unreliable code-generation behavior had better create another issue, to avoid further off-topic discussion here (and possibly make this issue depend on that one). Period.