Question about looping for read_events_ and write_events_
npuichigo opened this issue · 4 comments
Could anyone help to explain why to use a loop for read_events_
here.
if (read_events_.fetch_add(1, std::memory_order_acquire) == 0) {
RefPtr self_ref(ref_ptr, this);
do {
auto rc = OnReadable();
if (FLARE_LIKELY(rc == EventAction::Ready)) {
continue;
} else if (FLARE_UNLIKELY(rc == EventAction::Leaving)) {
FLARE_CHECK(read_mostly_.seldomly_used->cleanup_reason.load(
std::memory_order_relaxed) != CleanupReason::None,
"Did you forget to call `Kill()`?");
GetEventLoop()->AddTask([this] {
read_events_.store(0, std::memory_order_relaxed);
QueueCleanupCallbackCheck();
});
break;
} else if (rc == EventAction::Suppress) {
SuppressReadAndClearReadEventCount();
break;
}
} while (read_events_.fetch_sub(1, std::memory_order_release) !=
1);
QueueCleanupCallbackCheck();
});
We can only get read_events_ == 1
after the if condition, so is there any special case that looping to decrease read_events_
is necessary?
You oversimplified the code.
Line 176 starts a fiber to run OnReadable
asynchronously. This is done to parallelize handling of I/O events. However, this also introduces a side effect that read_events_
can be increased concurrently, after OnReadable
draining the read buffer, but before Line 222 is reached.
Since a fiber is created and detached after increasing read_events_
from 0 to 1 and L176 is the only line to increase read_events_
, could u point out which line may cause concurrent increase of read_events_
? Do u mean OnReadable
in a fiber may also start parallel invocation to this function?
Note that epoll
event can be triggered any time soon after the fd
is drained (edge triggered), even before OnReadable
has an opportunity to return.
IOW, there is a time window between returning EAGAIN
from syscall read
(inside OnReadable
), and decrementing read_events_
. If new data arrived in this time window, without checking event count we may lose the notification.
Consider this:
F1: Received EPOLLIN
, starts F2 to handle it,
F2: Done reading data, is either calling user’s callback or about to return, and got paused (due to whatever reason, e.g., interrupted by system),
F1: Another EPOLLIN
comes, now either:
- We start a new fiber, which would complicate thing a lot, as now the data stream is concurrently handled by two fiber, each with a part of the data (think about what to do if they both contain a portion of a large packet), or
- We’d miss the notification as
OnReadable
is about to return, and no one else would read the data.
Yes you’re right the only place that can increment the event count is L176, but it’s exactly that line that helps us to avoid missing the notification.
@0x804d8000 Thanks for your detailed explanation. Learn a lot after reading the implementations.