smol-rs/async-channel

`Recv` can lock on dropping

daxpedda opened this issue · 4 comments

I encountered this while testing stuff on Wasm (cleaned stacktrace):

TypeError: waiting is not allowed on this thread

std::sync::mutex::Mutex<T>::lock
event_listener::sys::<impl event_listener::Inner<T>>::lock
event_listener::sys::<impl event_listener::Inner<T>>::remove
event_listener::_::<impl core::ops::drop::Drop for event_listener::InnerListener<T,B>>::drop::__drop_inner
event_listener::_::<impl core::ops::drop::Drop for event_listener::InnerListener<T,B>>
core::ptr::drop_in_place<event_listener::InnerListener<(),alloc::sync::Arc<event_listener::Inner<()>>>>
core::ptr::drop_in_place<alloc::boxed::Box<event_listener::InnerListener<(),alloc::sync::Arc<event_listener::Inner<()>>>>>
core::ptr::drop_in_place<core::pin::Pin<alloc::boxed::Box<event_listener::InnerListener<(),alloc::sync::Arc<event_listener::Inner<()>>>>>>
core::ptr::drop_in_place<event_listener::EventListener>
core::ptr::drop_in_place<core::option::Option<event_listener::EventListener>>
core::ptr::drop_in_place<async_channel::RecvInner<()>>
core::ptr::drop_in_place<event_listener_strategy::FutureWrapper<async_channel::RecvInner<()>>>

Which I think boils down to:

  1. Recv holding RecvInner
  2. RecvInner holding Option<EventListener>
  3. EventListener holding InnerListener
  4. InnerListener drop implementation calling std::Inner::remove()
  5. std::Inner::remove() calling std::inner::lock()
  6. std::inner::lock() calling Mutex::lock()

I think in this case spinlooping makes sense when running on Wasm?
Let me know if this issue should be moved to event-listener instead.

Yes, the inner mechanism for the EventListener is a Mutex, which means that dropping a Recv will temporarily lock the inner Mutex so it can update its internal state. If this is happening on a single-threaded WASM context, this shouldn't be happening and it's a bug. If it's happening in web workers, it's expected.

One option is to switch Web to event-listener's lock-free backend, although it's significantly slower.

If this is happening on a single-threaded WASM context, this shouldn't be happening and it's a bug. If it's happening in web workers, it's expected.

No, this happened in multi-threaded Wasm context, but still on the main thread and not in a Web worker.
Was it assumed that multi-threaded Wasm means that its always running in a Web worker?

I can't seem to find the code in async-channel or event-listener guarding this stuff behind Wasm or atomics, where exactly does this happen?

One option is to switch Web to event-listener's lock-free backend, although it's significantly slower.

Dynamic detection would be nice, but that would require depending on wasm-bindgen.

I can't seem to find the code in async-channel or event-listener guarding this stuff behind Wasm or atomics, where exactly does this happen?

The web code just uses the normal code, I think.

Dynamic detection would be nice, but that would require depending on wasm-bindgen.

Eh, I'd rather just switch into "lock-free" mode on cfg(all(target_arch = "wasm", not(target_os = "wasi"))).