parallel-runtimes/lomp

Use C++17 aligned allocators

Closed this issue · 11 comments

The runtime currently uses its own aligned allocators, however C++17 has support in the standard for aligned allocators. It would therefore make sense to use the standard support, rather than having our own.

I have done something about this using the aligned allocation function from C++17. Did that already resolve this issue?

Yes, I reckon so, provided your fix caught all of them!

That's indeed a good question. I actually do not know, since this issue does not list specific examples of what should be replaced. :-)

Which branch is this fix in? I still see a call to posix_memalign in barrier_impl.cc in the "main" branch...

Which branch is this fix in? I still see a call to posix_memalign in barrier_impl.cc in the "main" branch...

That's in the current main branch via commit 0bac3d5.

Looks good to me; if we find more we can always handle them as we see them.
The one thing that still looks a bit weird to me is the use of malloc and free in task allocation...

Yes, we should get rid of that. I would suggest introducing a memory manager that at the moment maps to the standard allocation routines of C++. We can then think of later replacing it with something cooler/smarter.

Even just switching from malloc/free to the C++ aligned allocator aligning at sizeof(void *) would probably be an improvement. And better than something grievous like allocating a new array of some suitable type and then casting a pointer to element zero.

Another approach might be to make initializeTaskDescriptor a placement new function. The pairing of task allocation and initialisation occur together AFAICS (and only twice). Though you'd still need to allocate the space, so that's really just syntactic sugar.

Even just switching from malloc/free to the C++ aligned allocator aligning at sizeof(void *) would probably be an improvement. And better than something grievous like allocating a new array of some suitable type and then casting a pointer to element zero.
Another approach might be to make initializeTaskDescriptor a placement new function. The pairing of task allocation and initialisation occur together AFAICS (and only twice). Though you'd still need to allocate the space, so that's really just syntactic sugar.

Well, the bad thing here is that __kmp_omp_task_alloc has do to the allocation of the memory needed to store the task and it's data and then __kmpc_omp_task can store it in the task pool. I'm thinking about a more C++y way of allocating memory; still I'm thinking that we should refactor all memory allocation code to call into a LOMP utility that for now does a regular memory allocation, but we could then add our own educational memory allocator later.

I realise that this task case is hard because the space needed is not compile time known, but one could still use something more C++ish than malloc. For everything else, we could probably derive from a base class that provided operators new and delete so the changes to classes would only be in their definition.

Yes! Indeed, and I fully agree that the usage of malloc/free is particularly ugly and should be replaced.