Doubt about Return Value Optimization section
Hi, I am not convinced by the code section about NRVO:
std::optional<std::string> make_heavy_object_mutable() {
    std::string x(1024, 'x');
    return x;
}
std::optional<std::string> make_heavy_object_immutable() {
    std::string const x(1024, 'x'); //! `const` is the only difference
    return x;
}
static void rvo_friendly(bm::State &state) {
    for (auto _ : state) bm::DoNotOptimize(make_heavy_object_mutable());
}
static void rvo_impossible(bm::State &state) {
    for (auto _ : state) bm::DoNotOptimize(make_heavy_object_immutable());
}

It states that the const prevents NRVO, but cv qualifications actually don't inhibit it (https://timsong-cpp.github.io/cppwp/n4659/class.copy#elision-1.1). I think this test isn't checking RVO at all, as the return type is actually std::optional, not std::string. What the const inhibits is moving the string inside the optional, and instead it forces the use of the copy constructor (that has to perform a memcpy).
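To make the point above concrete, here is a minimal sketch that swaps std::string for a hypothetical `tracer_t` type (my own name, not something from the tutorial) that counts its copy and move constructions. It assumes C++17 or later. When the local is mutable, the optional is built with the move constructor; when it is const, the copy constructor is the only viable choice.

```cpp
#include <cassert>
#include <optional>

// Hypothetical tracer type (an assumption for illustration): counts constructions.
struct tracer_t {
    static inline int copies = 0;
    static inline int moves = 0;
    tracer_t() {}
    tracer_t(tracer_t const &) { ++copies; }
    tracer_t(tracer_t &&) noexcept { ++moves; }
};

std::optional<tracer_t> make_from_mutable() {
    tracer_t x;
    return x; // `x` is first treated as an rvalue, so the optional's payload is move-constructed
}

std::optional<tracer_t> make_from_const() {
    tracer_t const x;
    return x; // a const object cannot bind to `tracer_t &&`, so the payload is copy-constructed
}

// Runs both factories and checks which constructor was used for the payload.
void check_optional_returns() {
    tracer_t::copies = tracer_t::moves = 0;
    auto a = make_from_mutable();
    assert(a.has_value() && tracer_t::moves == 1 && tracer_t::copies == 0);

    tracer_t::copies = tracer_t::moves = 0;
    auto b = make_from_const();
    assert(b.has_value() && tracer_t::copies == 1 && tracer_t::moves == 0);
}
```

Calling `check_optional_returns()` from `main` runs the check. In both functions the returned object is the optional, not `x`, so NRVO of `x` never comes into play.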
The behavior would likely depend on the compiler version and on flags such as -fno-elide-constructors. There are probably better examples that would behave more consistently. Let's think 🤔
I still don't think that it's testing what it says it's testing.
Godbolt example: https://gcc.godbolt.org/z/8Md39nnd5
Yes, I also don't like that part. I've now replaced the "optional string" with a heavy custom object with "sleep" calls in its constructors, but I'm still looking for a better set of examples for RVO.
I think that's a valuable example in its own right. It just demonstrates move inhibition rather than copy elision.
An example of NRVO inhibition is return std::move(x); which forces a move instead of elision, so any type with an expensive move constructor will do.
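A minimal sketch of that suggestion, assuming C++17 and a hypothetical `counted_t` type (a name introduced here only for illustration) that counts constructor calls instead of sleeping. NRVO on a plain `return x;` is permitted but not guaranteed, so the check only asserts an upper bound there, while `return std::move(x);` always runs exactly one move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counting type (an assumption for illustration).
struct counted_t {
    static inline int copies = 0;
    static inline int moves = 0;
    counted_t() {}
    counted_t(counted_t const &) { ++copies; }
    counted_t(counted_t &&) noexcept { ++moves; }
};

counted_t make_named() {
    counted_t x;
    return x; // NRVO candidate: usually elided; if not, it falls back to one move
}

counted_t make_pessimized() {
    counted_t x;
    return std::move(x); // operand is no longer the bare name of a local: elision is off, a move runs
}

// Runs both factories and checks the constructor counts.
void check_nrvo_inhibition() {
    counted_t::copies = counted_t::moves = 0;
    counted_t a = make_named();
    (void)a;
    assert(counted_t::copies == 0 && counted_t::moves <= 1);

    counted_t::copies = counted_t::moves = 0;
    counted_t b = make_pessimized();
    (void)b;
    assert(counted_t::copies == 0 && counted_t::moves == 1);
}
```

Recent GCC and Clang versions typically flag the second function with a pessimizing-move warning, which is the same observation from the compiler's side.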
How about this?
#include <chrono>  // `std::chrono::milliseconds`
#include <cstddef> // `std::size_t`
#include <cstdint> // `std::uint64_t`
#include <numeric> // `std::iota`
#include <thread>  // `std::this_thread::sleep_for`

struct heavy_t {
    std::uint64_t data[8];
    heavy_t() noexcept { std::iota(data, data + 8, 0); }
    heavy_t(heavy_t &&) { std::this_thread::sleep_for(std::chrono::milliseconds(1)); }
    heavy_t(heavy_t const &) { std::this_thread::sleep_for(std::chrono::milliseconds(2)); }
    heavy_t &operator=(heavy_t &&) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        return *this;
    }
    heavy_t &operator=(heavy_t const &) {
        std::this_thread::sleep_for(std::chrono::milliseconds(2));
        return *this;
    }
};
heavy_t make_heavy_object() { return heavy_t {}; }
heavy_t make_named_heavy_object() {
    heavy_t x;
    return x;
}
heavy_t make_conditional_heavy_object() {
    heavy_t x;
    heavy_t &x1 = x;
    heavy_t &x2 = x;
    //! Both branches return through references, so NRVO and the implicit move are off: `x` gets copied
    static std::size_t counter = 0;
    if (counter++ % 2 == 0) { return x1; }
    else { return x2; }
}
static void rvo_trivial(bm::State &state) {
    for (auto _ : state) bm::DoNotOptimize(make_heavy_object());
}
static void rvo_expected(bm::State &state) {
    for (auto _ : state) bm::DoNotOptimize(make_named_heavy_object());
}
static void rvo_banned(bm::State &state) {
    for (auto _ : state) bm::DoNotOptimize(make_conditional_heavy_object());
}
BENCHMARK(rvo_trivial);
BENCHMARK(rvo_expected);
BENCHMARK(rvo_banned);

-------------------------------------------------------
Benchmark              Time           CPU    Iterations
-------------------------------------------------------
rvo_easy           0.634 ns      0.634 ns     851954378
rvo_expected       0.640 ns      0.640 ns    1115473320
rvo_banned       2060564 ns       6039 ns         10000
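
The roughly 2 ms of wall time per rvo_banned iteration lines up with the 2 ms sleep in the copy constructor of heavy_t, which suggests that the copy constructor is what runs there. A likely explanation is that the function returns through lvalue references rather than through the name of the local itself. The sketch below isolates that effect, assuming C++17 and a hypothetical `probe_t` type (not part of the benchmark) that counts constructor calls instead of sleeping.

```cpp
#include <cassert>

// Hypothetical counting type (an assumption for illustration).
struct probe_t {
    static inline int copies = 0;
    static inline int moves = 0;
    probe_t() {}
    probe_t(probe_t const &) { ++copies; }
    probe_t(probe_t &&) noexcept { ++moves; }
};

probe_t make_via_reference() {
    probe_t x;
    probe_t &alias = x;
    return alias; // `alias` names a reference, not the local object: no NRVO, no implicit move
}

// Checks that returning through the reference costs exactly one copy.
void check_reference_return() {
    probe_t::copies = probe_t::moves = 0;
    probe_t p = make_via_reference();
    (void)p;
    assert(probe_t::copies == 1 && probe_t::moves == 0);
}
```

No branch is needed to trigger the copy here, so the condition in make_conditional_heavy_object is probably not the deciding factor on its own.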