WFME( waitAll = true ) resets AutoReset Events before all Events are set

Question

WFME( waitAll = true ) resets AutoReset Events before all Events are set

qwertymaster617 opened this issue 4 years ago · 5 comments

Since pevents is supposed to be a WFMO clone, I'm not sure if this is a bug report or a feature request, because the implementation just doesn't do this right now.

From the Remarks section of the documentation for WaitForMultipleObjects on MSDN (emphasis mine):

When bWaitAll is TRUE, the function's wait operation is completed only when the states of all objects have been set to signaled. The function does not modify the states of the specified objects until the states of all objects have been set to signaled. For example, a mutex can be signaled, but the thread does not get ownership until the states of the other objects are also set to signaled. In the meantime, some other thread may get ownership of the mutex, thereby setting its state to nonsignaled.

This means that during a WFME call where waitAll = true, the state of events shall not be reset, in the specific case of auto-reset Events, until all are simultaneously in the set state. This would apply to manual-reset Events too, except that their state cannot change due to a WaitForEvent/WFME call unless an explicit call to ResetEvent is made. Note that even though WMFE( waitAll = true, milliseconds = UINT64_MAX ) shall return only if all events are signaled, this bug applies to waitAll = true calls to WFME with any timeout value because the state shall only change once all events are in the set state, and the current implementation immediately resets the event before waiting if the Event is set.

This seems to be a significant change (hence, a feature request?) simply because the wait state is never added to any Event that is already in the set state, which is incorrect because another thread could call ResetEvent on any already-set Event, change the state back to not-set, and thus should increment the EventsLeft shared state member.

The following test case exhibits the error:

neosmart::neosmart_event_t lEvents[3];
lEvents[0] = neosmart::CreateEvent( false, true );  // Already Signaled AutoReset
lEvents[1] = neosmart::CreateEvent( false, false ); // Not Signaled AutoReset
lEvents[2] = neosmart::CreateEvent( false, true );  // Already Signaled AutoReset

// WFMO is non-destructive if a wait-all with any timeout value fails on auto-reset events.
if ( neosmart::WaitForMultipleEvents( lEvents, 3, TRUE, 0 ) == WAIT_OBJECT_0 )
    throw std::runtime_error( "Must not be signaled!" );

// FAILS!!
if ( neosmart::WaitForEvent( lEvents[0], 0 ) != WAIT_OBJECT_0 )
    throw std::runtime_error( "Must be signaled" );

if ( neosmart::WaitForEvent( lEvents[1], 0 ) != WAIT_TIMEOUT )
    throw std::runtime_error( "Must not be signaled" );

// FAILS!!
if ( neosmart::WaitForEvent( lEvents[2], 0 ) != WAIT_OBJECT_0 )
    throw std::runtime_error( "Must be signaled" );


// WFMO is destructive if a wait-all succeeds with any timeout value on auto-reset events.
for ( auto& lEvent : lEvents )
    neosmart::SetEvent( lEvent );
if ( neosmart::WaitForMultipleEvents( lEvents, 3, TRUE, 0 ) != WAIT_OBJECT_0 )  // OK
    throw std::runtime_error( "Must be signaled!" );
for ( auto& lEvent : lEvents )
{
    if ( neosmart::WaitForEvent( lEvent, 0 ) != WAIT_TIMEOUT ) // OK
        throw std::runtime_error( "Must not be signaled" );
    neosmart::DestroyEvent( lEvent );
}

Answer 1 · 2020-09-04T03:29:33.000Z

Hi, thanks for the PR. I agree that it would be best to mimic the Win32 behavior here, both for conformity but especially because code depending on those guarantees might not (or more likely, would not) work without them.

Do you have suggestions on how this can be cleanly implemented with the current approach?

Answer 2 · 2020-09-04T05:04:23.000Z

Yes. With the current implementation, the solution is 6-fold. There are opportunities for optimizations, but the general idea is as follows:

Each event must contain a reference to the shared state, regardless of whether or not the event has been signaled at the time of the WFME call.
Each event must contain knowledge of whether or not it has already signaled the shared state.
When ResetEvent is called, in addition to resetting its own state, for all still-waiting registered shared states the event must increment the EventsLeft member if said event has already signaled the state, effectively resetting its prior signaling of the shared state.
When SetEvent is called, the event may only signal a shared state if it has not already done so.
The event may only discard a shared state if the waiter no longer exists or the shared state has been signaled, i.e. discarding a shared state during SetEvent if the event has just decremented the shared state is no longer sufficient.
When the WFME call is in the signaled state, regardless of whether or not it waited, then WaitForEvent( timeout = 0 ) may be called on all events prior to returning a successful wait result.

Given the current implementation along with the above modifications:

When calling ResetEvent, as long as knowledge is held that all shared states have or have not been set by a given event, subsequent calls to SetEvent will still properly signal the shared state.

The beauty of calling WFE( 0 ) on all events in sequence when WFME has been signaled is that you don't need to keep track of whether the event is auto or manual reset, as each event already knows whether or not it must be reset on successful wait. You also don't need to bother holding locks as is necessary when checking the state of each event and adding the wait states, because all you need to do is attempt to reset auto reset events. It doesn't even have to be successful because any race conditions caused by concurrent WFME, ResetEvent, SetEvent, or WFE calls on any of the events during a given WFME call is totally undefined behavior, which is exactly how WaitForMultipleObjects works. So simply calling the public WFE( 0 ) for each event after a successful wait satisfies the contract.

My derivative implementation uses the above approach and I can't come up with any test cases where it is broken. I would appreciate scrutiny because it only took me a day to come up with this surprisingly-simple approach when I ran into this issue with my own implementation. I'm appreciative of your work on this and since I derived mine from yours, I'm happy to provide my solution as stated above.

Answer 3 · 2021-06-27T20:02:21.000Z

@qwertymaster617 I know it's been some time, but this has been on the back of my mind. Your approach only works if we can take the following for granted:

It doesn't even have to be successful because any race conditions caused by concurrent WFME, ResetEvent, SetEvent, or WFE calls on any of the events during a given WFME call is totally undefined behavior, which is exactly how WaitForMultipleObjects works.

Otherwise performance will necessarily suffer because a global lock (or some equivalent) would be required to guarantee that a WFMO waiter awaiting n objects all of which are currently in a signalled state can atomically claim all n objects.

The problem is that I don't see anything that documents the race behavior that you describe. I believe the only undocumented/undefined behavior for WaitForMultipleObjectsEx is the order in which order multiple concurrent calls to some or all the same underlying objects will return.

I don't think there's any ambiguity when it comes to auto reset events: they must never allow more than one awaiting thread to wake for any given call to SetEvent(). If you have two threads A and B, and A is waiting on auto-reset events X, Y, Z with bWaitAll = true and B is waiting on Z then X, Y, and Z become available, A must either atomically acquire all the events or none of the events, but in your design, wakeup of A could begin by calling WFE(X, 0) and discarding the result, WFE(Y, 0) and discarding the result, then WFE(Z, 0) and discarding the result. But simultaneously a wakeup of B could be in progress and be scheduled between the waits on Y and Z, while Z is still available. B's WFE(Z, 0) would happen first and allow B to wake, then A's WFE(Z, 0) would occur, silently fail, and A would be allowed to wake, thus allowing two waiters to go through for one SetEvent(Z) call.

Answer 4 · 2021-07-06T02:51:48.000Z

You're right. Indeed, when I made the comment last year I had this exact scenario in mind.

However, when I was compiling this last year with MSVC on Win10 with Win32 versions of WFSO/WFMO/SetEvent each on their own thread with a single event, I was getting memory access violations on all 3 calls. It was strange. So I just figured that since the behavior was not specified at all on MSDN in this manner that it was unspecified or even undefined.

When I compile the same program today with MSVC 16.10.3, I no longer have this issue and can seemingly wait on events in arbitrary and ridiculous ways and it seems to work as one would expect, so maybe I just ran into unfortunate bugs at the time?

Anyway, the solution presented seems sound to me. I thought long and hard about it. I can't think of anything that breaks it, except for possible starvation because of the somewhat unbounded amount of spinning that can occur now on WaitForAll = true waits during high contention with Set/Reset calls.

Even so, this seems to be a pretty good solution to a hairy problem! I've finally had a chance this weekend to implement it and it works well. I also made a few comments on commit 2db23ff.

I'm a bit concerned about possible starvation and other issues where we single out one WFME waiter and wake them up when we're an auto-reset event. I understand fully why it's done, but possible over-reliance on priority inversion, the inability to reset events after being set, and inability to ever keep the event in the set state whenever there is at least one concurrent WFME call, but that's probably for a seperate discussion.

With this fix, I'm about 99% confident this is a spirit-and-letter API-compatible WaitForMultipleObjects clone.

Thanks!

Answer 5 · 2021-07-10T22:22:34.000Z

Note that I've rebased the atomic_wait_all branch over the latest master; I believe a lot of the performance issues should be mitigated, the only thing that remains is the potential for thread starvation under contention as the number of events being waited upon increases. I believe this is also true of Microsoft's WaitForMultipleObjectsEx() implementation.