oneapi-src/oneTBB

Status of tbb::zero_allocator?

dan131riley opened this issue · 9 comments

tbb::zero_allocator has disappeared from tbb_allocator.h but it still appears in the oneAPI spec and I don't see any changelog or release notes saying that it was removed. What's the status?

It seems that zero_allocator was removed by mistake.

However:

  • it seems that the only use case for it is the "waiting for element construction" pattern in concurrent_vector (as described in the developer guide), which in some cases might lead to undefined behavior in C++ and needs to be better thought out and described
  • it is a relatively small effort for users to implement it themselves

So deeper consideration is needed as to whether zero_allocator should be part of the oneTBB package.
At the moment its status is not clear.

Any news on that? In our research database Hyrise, we also use zero_allocator, which is not available in the latest oneTBB version.

@mweisgut can you share more details on your use case for zero_allocator?

In our case, we use a tbb::concurrent_vector<std::shared_ptr<T>, tbb::zero_allocator<std::shared_ptr<T>>>. This vector is accessed by multiple threads. The objects T that the shared pointers of the vector are pointing to have to be accessed atomically in our use case.

tbb::concurrent_vector does not guarantee that elements reported by size() are fully initialized.
Based on that assumption and an old Intel blog post from @anton-malakhov, we followed the solution proposed in the blog post:

The main idea is to detect which item is constructed by checking the item itself. First of all, you need stable state of the memory allocated for new items. For that, you could use zero-filling allocator or write an adapter that cleans the memory for existing allocator yourself. Then, you need a data type that is able to work starting with zero-filled memory instead of depending upon its constructor to run. For example, you could add a flag variable that is set to 1 at the end of constructor (remember to define it as atomic<> to get right fences and operation order). Another way is to use a pointer to the real data. If the pointer equals to zero, the element is not constructed surely. But when construction is done, you can store the pointer to make it visible for monitoring threads.
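The "pointer to the real data" variant described in the quote can be sketched with the standard library alone. This is a minimal sketch, not the blog post's code: `PublishedSlots`, `publish` and `try_get` are hypothetical names, and the slots here are properly constructed `std::atomic<T*>` objects rather than zero-filled raw memory, so the sketch shows only the publication protocol (construct fully, then store the pointer with release semantics; readers load with acquire and treat null as "not constructed yet").

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical fixed-size table of slots. A null pointer means
// "this element has not been published yet".
template <typename T>
class PublishedSlots {
public:
    // vector(n) value-initializes each std::atomic<T*>, so every slot starts as nullptr.
    explicit PublishedSlots(std::size_t n) : slots_(n) {}

    ~PublishedSlots() {
        for (auto& s : slots_) delete s.load(std::memory_order_relaxed);
    }

    // Writer: fully construct the object first, then make the pointer visible.
    // Assumption: each slot is published at most once.
    void publish(std::size_t i, T value) {
        assert(i < slots_.size());
        T* p = new T(std::move(value));
        slots_[i].store(p, std::memory_order_release);
    }

    // Reader: returns nullptr while the element is not yet published.
    // The acquire load pairs with the release store in publish().
    const T* try_get(std::size_t i) const {
        assert(i < slots_.size());
        return slots_[i].load(std::memory_order_acquire);
    }

private:
    std::vector<std::atomic<T*>> slots_;
};
```

A monitoring thread would call `try_get(i)` and skip or retry any index that still returns null.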

Later on, the post was extended:

Since TBB 2.2, you can use tbb::zero_allocator as described in the blog.

Thus, to avoid reading an incomplete shared_ptr<T>, we use zero_allocator for the concurrent_vector, making sure that an uninitialized entry compares equal to nullptr.

The objects T that the shared pointers of the vector are pointing to have to be accessed atomically in our use case.

I believe you have a data race here, because std::shared_ptr<T> is not thread-safe when the same instance is accessed through non-const member functions (consider std::atomic<std::shared_ptr<T>>). However, even with std::atomic<std::shared_ptr<T>> it is a gray area, because accessing (even atomically) a non-constructed shared_ptr (even one located in zero-filled memory) does not seem fully compliant with the C++ standard (e.g. are we sure that zero-filled memory means an empty shared_ptr?).

I believe it is still possible to create such a thread-safe std::optional-like class, which will not be in that gray area

I believe it is still possible to create such a thread-safe std::optional-like class, which will not be in that gray area

We failed to find a simple optional-like solution in pre-C++20 without changes to concurrent_vector or the use of external synchronization mechanisms.

The fundamental issue is: "Can you access an object that has not yet been initialized by a constructor?" Perhaps it is OK (gray area?) to access non-constructed objects that have a trivial constructor; however, starting with C++20 the default constructors of std::atomic and std::atomic_flag are not trivial.

The only resort seems to be std::atomic_ref (starting with C++20); however, it works only with trivially copyable types (e.g. native types). I.e. we can try to create a flag and access `initialized` through atomic_ref:

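#include <atomic>  // std::atomic_ref (C++20)

template <typename T>
struct MyThreadSafeOptional {
    // Plain bool rather than std::atomic<bool>; every access is meant to go
    // through std::atomic_ref<bool>, hence the alignment requirement.
    alignas(std::atomic_ref<bool>::required_alignment) bool initialized;
    union {
        T t;  // storage only; constructed and destroyed manually
    };
    // some methods to properly initialize and access `t`
};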

It's not hard to write your own zero_allocator. For example,

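#include <cstddef>      // std::size_t
#include <cstring>      // std::memset
#include <type_traits>  // std::true_type
#include <tbb/cache_aligned_allocator.h>

template <typename T>
class zero_allocator : public tbb::cache_aligned_allocator<T> {
 public:
  using value_type = T;
  using propagate_on_container_move_assignment = std::true_type;
  using is_always_equal = std::true_type;

  zero_allocator() = default;
  // Rebinding constructor; the allocator is stateless, so nothing is copied.
  template <typename U>
  explicit zero_allocator(const zero_allocator<U>&) noexcept {}

  // Allocate through the base allocator, then zero-fill the raw storage.
  // deallocate() is inherited unchanged.
  T* allocate(std::size_t n) {
    T* ptr = tbb::cache_aligned_allocator<T>::allocate(n);
    std::memset(static_cast<void*>(ptr), 0, n * sizeof(value_type));
    return ptr;
  }
};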
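To check the zero-filling behavior without a TBB installation, the same idea can be sketched on top of `std::allocator`. This is a stand-in under that assumption, not the removed `tbb::zero_allocator`; `zeroing_allocator` and `fresh_storage_is_zeroed` are hypothetical names, and unlike the version above it does not provide cache-line alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Stand-in for the allocator above, built on std::allocator instead of
// tbb::cache_aligned_allocator (assumption: alignment does not matter here).
template <typename T>
struct zeroing_allocator {
    using value_type = T;

    zeroing_allocator() = default;
    template <typename U>
    zeroing_allocator(const zeroing_allocator<U>&) noexcept {}

    // Allocate raw storage for n objects and fill it with zero bytes.
    T* allocate(std::size_t n) {
        T* p = std::allocator<T>{}.allocate(n);
        std::memset(static_cast<void*>(p), 0, n * sizeof(T));
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

// Stateless allocators always compare equal.
template <typename T, typename U>
bool operator==(const zeroing_allocator<T>&, const zeroing_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const zeroing_allocator<T>&, const zeroing_allocator<U>&) noexcept { return false; }

// Returns true if every byte of freshly allocated storage for n objects is zero.
template <typename T>
bool fresh_storage_is_zeroed(std::size_t n) {
    assert(n > 0);
    zeroing_allocator<T> alloc;
    T* p = alloc.allocate(n);
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(p);
    bool all_zero = true;
    for (std::size_t i = 0; i < n * sizeof(T); ++i) {
        if (bytes[i] != 0) all_zero = false;
    }
    alloc.deallocate(p, n);
    return all_zero;
}
```

Zeroed bytes only guarantee the state of the raw storage; whether a given element type may be read before its constructor has run is the separate question discussed earlier in this thread.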

@dan131riley is this issue still relevant?