chriskohlhoff/asio

ASIO provides no way to inject into pre-suspend/post-resume awaitable behavior or coroutine_specific_ptr

corporategoth opened this issue · 10 comments

I need to replicate something similar to thread-local storage in a project using coroutines.

Unfortunately, the awaitable class does not provide a way to do something prior to suspend in awaitable::await_suspend or post-resume in awaitable::await_resume()

Of course, the ideal would be able to have something like boost::thread::thread_specific_ptr that instead was more coroutine_specific_ptr instead. Such that each awaitable_thread ends up with a different instance - so if I store something in the coroutine_specific_ptr, then what was in it before a co_await is also in it after co_await, but each coroutine stack could have different versions.

But lacking this, if I can inject into the await_suspend and await_resume I can implement a poor-man's version of this, by saving something from my thread_local somewhere in await_suspend, then restoring it on await_resume.
I started this by making my own version of awaitable, which then could contain the value I want to store. Unfortunately, this breaks down very quickly, as co_spawn won't take my awaitable, and I also had to create my own version of awaitable_frame (inherited from the asio one) to add an await_transform for my awaitable, and also add coroutine_traits, etc. ie. it just gets messier.

The ideal would be some kind of coroutine_specific_ptr that is part of asio, that could then more or less have it's backing store in the awaitable_thread, so that it could always just address the right one.

Side note - it would be REALLY nice to have an is_running_in_coroutine() free function. And even a coroutine_identifier function that basically identifies the coroutine thread. Either of these would have made hacking together some kind of coroutine_local_ptr without modifying awaitable_thread (ie. external to asio) easier.

I wrote my own version of an awaitable_specific_ptr - #1496

Chris decided to make the awaitable a "walled of garden" on purpose, because it has strict requirements so that it can be used in multi-threaded environments.

boost.cobalt (I am the author) only requires single-threading and everything else is pretty much open. I'd be happy to discuss ideas if that fits your needs.

@klemens-morgenstern awaitable_specific_ptr works in a multithreaded environment. And my environment IS multithreaded. So there is no reason that awaitable_specific_ptr should not work (it works in my tests).

But even besides that - having access to some kind of coroutine_id (based on an ID from the lowest frame) would also be extremely helpful to be able to implement something outside of asio as well.

I am not about to switch a large commercial code base from a multi-threaded system to a single-threaded system, I would rather provide the enhancements to asio that it needs.

Different idea: Couldn't you create a custom executor per co_spawn that provides the local storage?

@klemens-morgenstern Without a way to get some kind of unique coroutine-specific ID, there is no way to make coroutine-local-storage.

There is a reason I had to write the awaitable_specific_ptr - it creates a few await_transform calls that allow it to work. Without either that (which one cannot do arbitrarily - since they are part of the awaitable_frame class, and you can't just derive that class and use it seamlessly with the rest of asio), or some kind of ID I can use externally, there is no way to make something that acts like a thread_specific_ptr - ie. something you can access within the same coroutine, without passing it everywhere.

That said, I don't know enough about executors, and how they're implemented to answer your question. But from what I can tell, even the executor context that is used by the executor won't help much? It still doesn't get a unique coroutine-specific ID (ie. the bottom frame or entry point awaitable) that it could use to differentiate different coroutine threads of execution. But besides that, implementing my own executor seems a little extreme?

But to answer your question, what I am trying to do is replicate thread_specific_ptr as coroutine_specific_ptr - so that I can store different values for each thread of execution (which is actually a frame stack, starting with the bottom frame from the co_spawn). I could do it completely externally if I could get some kind of coroutine ID that links back to the bottom frame, but that doesn't exist. But the MR I attached adds the function natively.

Right, but I am not sure this functionality will be added to asio. I actually doubt it, as it's relying on an implementation detail of awaitables. I've written up a minimal solution using an executor here: https://gcc.godbolt.org/z/EvPjz4c5n

The advantage of that is that you could add your id (or value storage) to any other composition, e.g. stackful coroutines or async_compose, too. And you could share it too.

@klemens-morgenstern That doesn't work alongside existing code that returns a standard awaitable (ie. everything in ASIO).

For example:

boost::asio::awaitable<void> bar() {}
my_awaitable<void> foo() {
   co_await bar();
}

boost::asio::co_spawn(executor_with_id(ioc.get_executor(), 1), &foo, boost::asio::detached());

The above will fail, because the await_transform that is invoked (on the awaitable_frame) doesn't know what to do with an awaitable using a different executor.

No, if you consider instead of calling bar(), my foo() needed to cal boost::asio::async_read(socket, buffer, boost::asio::use_awaitable) - then you see the solution you proposed falls down quickly if I want to use the rest of asio. Though I guess I could just make my own typedef of use_awaitable_t and add use_my_awaitable.

I DO have a solution that works, and it does involve a custom executor, but it relies on invoking the boost::asio::any_io_executor's target() function (ie. exec->target<my_executor>() ) to get back to my own executor with storage in it. But this is also kind of cumbersome and annoying to use. AND it still requires me to modify every co_spawn() call to use this custom executor.

That would be like requiring every std::thread() call to need to invoke std::my_thread to start a thread in order to get the thread_specific_ptr. The point of thread_specific_ptr is it's automatic. Which is what I was trying to do with my MR.

Not only that, the executable_specific_ptr ALSO requires you to more or less pass it the executor along with the get() for it to work, as opposed to the awaitable_specific_ptr, which just works (tm) because the await_transform already has the required piece of information (ie. the lowest frame in the stack) that it needs to work.

It's one thing to say this CAN be implemented without modifying asio, but the amount of gymnastics involved to do so is ridiculous. Especially since the MR to add this is so small, and it just brings some level of parity for coroutines/awaitables and threads (which they are somewhat analogous to).

So I am hoping that Chris will actually add my MR.

Well, I got one other hacky idea: take the address of the result of co_await asio::detail::awaitable_thread_has_context_switched{} to identify the existing awaitable_thread.

Good luck with your PR.

@klemens-morgenstern Unfortunately, that awaitable returns a bool.

As for my PR relying/modifying the details - since it's a PR for ASIO itself, that's acceptable I think :)