Potential issue with "controlled ids" concept in y-websocket server implementation
mifopen opened this issue · 4 comments
I'm not sure that understand the concept fully, that's why I'm using "potential" in the title.
Could you please elaborate on the purpose of this piece:
https://github.com/yjs/y-websocket/blob/master/bin/utils.js#L111-L115
const connControlledIDs = /** @type {Set<number>} */ (this.conns.get(conn))
if (connControlledIDs !== undefined) {
added.forEach(clientID => { connControlledIDs.add(clientID) })
removed.forEach(clientID => { connControlledIDs.delete(clientID) })
}
Later this list is used for removing the state of controlled ids of the connection on closing it.
I see why we need it. But what if during the connection lifecycle it sends a removal event for some other client awareness because of the timeout? It means that other clients will be in the "controlled ids" list and their state will be purged on the server together with current client closure.
For example,
- There are clients X and Y
- Client X call removeAwarenessStates([Y]) because of no updates from Y in outdatedTimeout ms
- emit('update', {removed: [Y]})
- X sends [{clientID: Y, state: null}]
- Server handles {removed: [Y]}
- Server adds Y to X's "controlled ids" list
- X closes the connection
- Server removes X and Y awareness states and notifies others about it
- Client Y receives the update that tells that it was removed but we have special handling of this case here (https://github.com/yjs/y-protocols/blob/master/awareness.js#L259) so it sends the clock update and the server reinstates the client Y notifying everyone again
Is it a correct sequence of steps?
@dmonad Don't know why the issue is marked with wontfix
automatically. Hope it doesn't mean that it goes directly to a trash bin 😁
It could be a problem to distinguish between "your own" ids and "someone else's" ids in the awareness update handler. Could even require protocol changes. But I'm just wondering why we even need to distribute this kind of "I marked remote client as absent because of the timeout on my side" updates. If the remote client is really absent, then every other awareness instance will be able to mark it as so by itself which should be enough, right?
Hey! Is there any way we can help with it?