Feature Request: Setup and Teardown for `benchmark_base`

To implement a setup step that is shared among all states of a `benchmark_base`, a current workaround could be:

void my_benchmark(nvbench::state& state) {
  static int num_execs = 0;
  if (num_execs == 0) {
    // SETUP calls here
    // e.g. expensive hard disk I/O
    ++num_execs;
  }
  }
  state.exec([](nvbench::launch& launch) { 
    my_kernel<<<num_blocks, 256, 0, launch.get_stream()>>>(/* uses data from SETUP */);
  });
}
NVBENCH_BENCH(my_benchmark).add_int64_axis("i", {1, 2, 3, 4, 5, ..., 99});
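A variant of the same workaround is to let a function-local static object own the setup data, so that its constructor acts as the setup and its destructor as the teardown. The following is only a minimal sketch: `fake_state`, `shared_resources`, and the two counters are hypothetical stand-ins so the snippet compiles without NVBench, and the kernel launch is left as a comment.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for nvbench::state so the sketch builds without NVBench.
struct fake_state {};

// Counters that make the behaviour observable; not part of any real API.
int setup_calls    = 0;
int teardown_calls = 0;

// Owns whatever the SETUP step produces.
struct shared_resources
{
  std::vector<int> data;

  shared_resources() : data(1024, 42) { ++setup_calls; } // e.g. expensive disk I/O
  ~shared_resources() { ++teardown_calls; }              // runs at program exit
};

void my_benchmark(fake_state & /*state*/)
{
  // Initialized on the first call only; later calls reuse the same object.
  static shared_resources resources;
  assert(!resources.data.empty());
  // state.exec(...) would launch a kernel that reads resources.data here.
}
```

Calling `my_benchmark` once per state runs the constructor a single time. The teardown, however, only happens when the program exits rather than right after the last state of this benchmark, which is one reason explicit hooks would still be more convenient.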

It would be more convenient to have an explicit setup and teardown functionality (similar to e.g. Boost Test). For example:

void my_benchmark_setup() {
  // SETUP calls here
}

void my_benchmark_teardown() {
  // TEARDOWN calls here
}

void my_benchmark(nvbench::state& state) {
  state.exec([](nvbench::launch& launch) { 
    my_kernel<<<num_blocks, 256, 0, launch.get_stream()>>>(/* uses data from SETUP */);
  });
}
NVBENCH_BENCH(my_benchmark).add_int64_axis("i", {1, 2, 3, 4, 5, ..., 99})
  .register_setup(my_benchmark_setup).register_teardown(my_benchmark_teardown);
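To make the intended semantics of the proposed registration concrete, here is a small sketch of what such hooks might do: setup once, then every state, then teardown once. `fake_benchmark`, its members, and `run_registration_demo` are hypothetical and only mimic the chained call style shown above; none of this is existing NVBench API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a benchmark object; not the NVBench API.
class fake_benchmark
{
public:
  explicit fake_benchmark(std::function<void(std::int64_t)> body)
      : m_body(std::move(body))
  {}

  fake_benchmark &add_int64_axis(const std::string & /*name*/,
                                 std::vector<std::int64_t> values)
  {
    m_values = std::move(values);
    return *this;
  }

  fake_benchmark &register_setup(std::function<void()> fn)
  {
    m_setup = std::move(fn);
    return *this;
  }

  fake_benchmark &register_teardown(std::function<void()> fn)
  {
    m_teardown = std::move(fn);
    return *this;
  }

  // Setup runs once, then the body runs for every axis value, then teardown runs once.
  void run() const
  {
    assert(m_body && "a benchmark body is required");
    if (m_setup) { m_setup(); }
    for (std::int64_t v : m_values) { m_body(v); }
    if (m_teardown) { m_teardown(); }
  }

private:
  std::function<void(std::int64_t)> m_body;
  std::function<void()> m_setup;
  std::function<void()> m_teardown;
  std::vector<std::int64_t> m_values;
};

// Records the order in which the hooks and the per-state body are invoked.
std::vector<std::string> run_registration_demo()
{
  std::vector<std::string> log;
  fake_benchmark bench(
    [&log](std::int64_t v) { log.push_back("state " + std::to_string(v)); });
  bench.add_int64_axis("i", {1, 2, 3})
    .register_setup([&log] { log.push_back("setup"); })
    .register_teardown([&log] { log.push_back("teardown"); });
  bench.run();
  return log;
}
```

Note that the hooks take no arguments in this sketch, so anything the setup produces would have to reach the benchmark body through captured or global state.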

A discussion of what the setup and teardown registration should look like would be helpful.

EDIT: Never mind. I totally overlooked that you wanted this to be shared among all states of the benchmark. The solution I described above would be unique per state.

Manually registered setup/teardown functions are a bit inconvenient and mean that any state has to be passed through global state or side effects. I think something akin to fixture classes would be more convenient.

~~Perhaps something like this:~~

struct my_fixture{
   my_fixture(){ /* setup */}
   ~my_fixture(){ /* tear down */}

   auto get_resources(){ /* return whatever resource(s) this fixture owns */}
};


void my_benchmark(nvbench::state& state, my_fixture& fixture){
   auto resources = fixture.get_resources();
   state.exec([&resources](nvbench::launch& launch) {
    my_kernel<<<num_blocks, 256, 0, launch.get_stream()>>>(resources);
   });
}


NVBENCH_FIXTURE_BENCH(my_fixture, my_benchmark).add_int64_axis(...);

~~Internally, when creating the callable wrapper, it can make the callable construct the fixture and pass it to the benchmark function.~~

#define NVBENCH_DEFINE_FIXTURE_CALLABLE(fixture_name, function, callable_name) \
  struct callable_name                                                         \
  {                                                                            \
    void operator()(nvbench::state &state, nvbench::type_list<>)               \
    {                                                                          \
      fixture_name f{};                                                        \
      function(state, f);                                                      \
    }                                                                          \
  }

To have the setup/teardown shared among all states of the benchmark, the same user-facing interface would work. Internally, instead of constructing the fixture before each benchmark invocation, it can be constructed once, before iterating through the states of the benchmark:

nvbench/nvbench/runner.cuh, line 97 in 610b776:

for (nvbench::state &cur_state : states)

auto fixture = typename benchmark_type::fixture_type{};
for (nvbench::state &cur_state : states) {
   ...
   if constexpr( std::is_same_v<typename benchmark_type::fixture_type, no_fixture> )
      kernel_generator{}(cur_state, type_config{});
   else
      kernel_generator{}(cur_state, type_config{}, fixture);
}

nvbench/nvbench/benchmark.cuh, lines 53 to 54 in ff50759:

template <typename KernelGenerator, typename TypeAxes = nvbench::type_list<>>
struct benchmark final : public benchmark_base

This would likely require modifying the benchmark type to add an additional template parameter for the FixtureType that can be defaulted to a sentinel tag type:

struct no_fixture{};
template <typename KernelGenerator, typename TypeAxes = nvbench::type_list<>, typename Fixture = no_fixture>
struct benchmark{
   using fixture_type = Fixture;
   ...
};
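The two pieces above (a defaulted `Fixture` template parameter and a single fixture constructed before the state loop) can be tried out in isolation. The sketch below assumes C++17 and uses hypothetical stand-ins throughout: `fake_state`, `plain_generator`, `fixture_generator`, the counters, and the `run` member are invented for illustration and do not reflect how NVBench's runner is actually structured.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins; none of these are real NVBench types.
struct fake_state { int value = 0; };
struct no_fixture {};

// Counters that make construction and destruction observable.
int fixtures_built     = 0;
int fixtures_destroyed = 0;

struct my_fixture
{
  my_fixture() { ++fixtures_built; }      // setup
  ~my_fixture() { ++fixtures_destroyed; } // teardown
  int get_resources() const { return 7; }
};

// Generator for benchmarks without a fixture.
struct plain_generator
{
  void operator()(fake_state &state) const { state.value += 1; }
};

// Generator for benchmarks that receive the shared fixture.
struct fixture_generator
{
  void operator()(fake_state &state, my_fixture &fixture) const
  {
    state.value += fixture.get_resources();
  }
};

template <typename KernelGenerator, typename Fixture = no_fixture>
struct benchmark
{
  using fixture_type = Fixture;

  void run(std::vector<fake_state> &states) const
  {
    // Constructed once, before iterating over the states.
    [[maybe_unused]] fixture_type fixture{};
    for (fake_state &cur_state : states)
    {
      if constexpr (std::is_same_v<fixture_type, no_fixture>)
      {
        KernelGenerator{}(cur_state);
      }
      else
      {
        KernelGenerator{}(cur_state, fixture);
      }
    }
  } // The fixture is destroyed here, after the last state has run.
};

// Runs three states against one shared fixture and returns how many fixtures were built.
int run_shared_fixture_demo()
{
  std::vector<fake_state> states(3);
  benchmark<fixture_generator, my_fixture>{}.run(states);
  assert(fixtures_destroyed == fixtures_built); // teardown ran when run() returned
  return fixtures_built;
}
```

Because the sentinel `no_fixture` is the default, existing benchmarks such as `benchmark<plain_generator>` would keep compiling unchanged, while fixture benchmarks get deterministic teardown as soon as the last state finishes.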
