Refactoring the `'dirty'` event, `dirtyPayload()`, and `XXXPayloadDirty()` methods logic & spec
huan opened this issue · 27 comments
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
Working PRs
The New Design
In this new design, we will use:
'dirty'
event (added via wechaty/grpc#79)dirtyPayload()
method (added via #114)XXXPayloadDirty()
method (removed via #114)
With the logic:
'dirty'
event will be emitted by the deepest puppet by calling thedirtyPayload()
method:PuppetServiceClient(PuppetServiceServer(PuppetAbstract))).dirtyPayload()
->PuppetAbstract.emit('dirty', payload)
- Every Puppet listen to
'dirty'
event with itsonDirty()
method, and invalid(remove) the cache of the payload with the specificid
andtype
(registeronDirty()
callback to thedirty
event will be done automatically by thePuppetAbstract
) XXXPayloadDirty()
should calldirtyPayload()
first, then wait the'dirty'
event. After receiving the'dirty'
event, it can return because the payload should be dirtied by the listener (which should be registered by the Puppet itself)- a timer should be added in case of the
'dirty'
event was lost, and throw an Error. (5 seconds in the current code)
- a timer should be added in case of the
New Puppet API: onDirty()
onDirty()
will beaddListener
to thedirty
event by thePuppetAbstract
- child puppets need to override it to clean their own payload cache/store (and do not forget to call
super.onDirty()
!)
Dirty a Payload Locally (in Puppet Provider)
How to dirty a payload (for puppet provider, locally):
puppet.XXXPayloadDirty(id)
future = await new Promise(resolve => puppet.once('dirty', resolve))
- check the dirty id/type match before resolve
- add a timer to reject the Promise if timeout
puppet.dirtyPayload('XXX', id)
setImmediate(() => puppet.emit('dirty', { type: 'XXX', id }))
<- add this task (emit) to the end of event loop queue
puppet.onDirty()
2. clean the cachepuppet.XXXPayloadDirty(id)
<- future resolvedawait new Promise(setImmediate)
<- wait current event loop task queue to be all executed
call 1, emit 2, listen 3, resolve 4
Dirty a Payload Remotely (with Puppet Service)
- 🖥️ Puppet Service
puppetService.XXXPayloadDirty(id)
future = await new Promise(resolve => puppet.once('dirty', resolve))
- check the dirty id/type match before resolve
- add a timer to reject the Promise if timeout
puppetService.dirtyPayload('XXX', id)
gRPC Service DirtyPayload
- ☁️ Puppet Server
puppetServer.dirtyPayload('XXX', id)
puppetServer.emit('dirty', { type: 'XXX', id })
puppetServer.onDirty()
listener: remove payload cachegRPC Service Event (Stream)
- 🖥️ Puppet Service
puppetClient.emit('dirty', { type: 'XXX', id })
puppetClient.onDirty()
listener: remove payload cachepuppetClient.XXXPayloadDirty(id)
future resolved
Steps Mapping
Locally | Remote Client | Remote Server |
---|---|---|
1 | 1. i | 2.i |
2 | 1. ii | 2.ii |
3 | 3. ii | 2.iii |
4 | 3. iii | 2.iv |
Related issues
Breaking Change
This change will not affect the end-users.
However, the puppet provider developer needs to follow this new logic when they are using the new version of Puppet Abstract Class.
The Puppet Service will be affected too, so we recommend paying attention to the Wechaty versions to be matching on both the server and the client.
CC @wechaty/puppet @wechaty/grpc
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
Start working on this Issue
Done.
- puppetServer.dirtyPayload('XXX', id)
- puppetServer.emit('dirty', { type: 'XXX', id })
- puppetServer.onDirty() listener: remove payload cache
- gRPC Service Event (Stream)
Currently in step1,the server remove payload cache.Why make it so complex?
Currently in step1, the server removes the payload cache. Why make it so complex?
This complexity is necessary because the Wechaty Puppet Service Client also holds a cache layer, which means that if the Server only removed the payload cache for itself, the Client will not get notified and still hold the outdated (dirty) payload and use it, causing the end user problems like the alias
, topic
can not be updated, etc.
The client call dirtyPayload method ,which means that the client can remove the cache.
dirtyPayload is request-reponse .If the server remove some payload cache,the server will emit dirty
event to client.
If the server removes some payload cache, the server will emit a dirty event to the client.
Yes, that's all the server needs to do: when it dirty any payload, emitting a dirty
event will be enough.
Because the client can not know when they should remove a cache from the server, they must be notified by the server, with a dirty
event. After the client received the dirty
event, then they will be able to clean the cache, and then request the payload again from the server to get the latest data.
That's all our current design is for, thank you very much for pointing this out.
客户端请求payloadDirty的时候,就是客户端希望服务器端删除缓存,这个时候客户端请求成功之后就可以删除本地缓存了啊, 难道还需要服务器再发个event给客户端?是不是有点多此一举?
如果是服务器端自己根据逻辑判断需要删除缓存,那么服务器端会发送event给客户端的,我觉得0.x的设计是对的啊。
Thanks for raising this discussion!
What you said is only partially true in the paimon use case because paimon is the most simple architecture in the wechaty ecosystem.
In a complicated architecture like WXWork or Pad
Local gateway:
Puppet WXWork -> Puppet Service Server -> gRPC -> Puppet Service Client -> Wechaty
This design is necessary for dealing with the dirty payloads to be correct.
puppetServer.dirtyPayload('XXX', id)
puppetServer.emit('dirty', { type: 'XXX', id })
puppetServer.onDirty() listener: remove payload cache
gRPC Service Event (Stream)
能分别介绍一下,在Grpc服务端干啥工作? 第三步和第四步分别是啥?Grpc里面没有onDirty接口啊?第四步是啥意思?这个改动我没看明白,尤其是对于Grpc服务端正确的做法。
谢谢!
Hi @zpaimon
Thanks for asking!
This issue is written for the standard Wechaty Puppet Provider, so it has some Wechaty Puppet Abstract API implementation details, which might be confusing you because the Paimon is a pure gRPC service.
I think from Paimon's perspective, what you need to care about is only the below two gRPC methods (and safely ignore all the 第三步和第四步 & Grpc里面没有onDirty接口 & 第四步):
rpc Event()
rpc DirtyPayload()
You can read the detail of the above methods from the proto buffer definition at Wechaty gRPC Repo
So here's the answer to your question: "在Grpc服务端干啥工作": you need to implement the below two algorithms:
1. send EventResponse()
to the rpc Event()
stream
When the Paimon has any payload updated (like contact alias change, room topic change, room member change, etc.), it needs to send a new EventResponse()
to the grpc Event()
stream to the client, with the EventType
is EVENT_TYPE_DIRTY
:
message EventResponse {
EventType type = 1;
// TODO: Huan(202002) consider to use a PB Map?
string payload = 2; // JSON.stringify({ ... })
}
Here's the EventResponse link and EventType link for the above definition.
2. react rpc DirtyPayload()
call
This design is for helping the client to force re-sync the data from the server.
When Paimon received an rpc DirtyPayload()
call, it needs to:
- delete the server cache of the id with the payload type first, then
- send an
EventResponse
to therpc Event()
stream with the sametype
andid
to notify the client that the server has dirty-ed the payload.
I hope my explanation can help you start understanding this design. Please feel free to let me know if you have any more questions.
Thank you very much!
send an EventResponse to the rpc Event() stream with the same type and id to notify the client that the server has dirty-ed the payload.
I think paimon only has't the above logic and other things is ok!
Is the design compatible with 0.x API ?
If the client is 0.x and grpc server is 1.x,dead loop will happen.
Client(DirtyPayload)->Server(dirtyPayload and send Dirty event to client)->Client(DirtyPayload)........
Dead loop !
You are welcome.
Glad to know that you made it clear that what the Paimon need to follow, and please feel free to let me know if you have other questions.
Is the design compatible with 0.x API ?
If the client is 0.x and grpc server is 1.x,dead loop will happen.
Client(DirtyPayload)->Server(dirtyPayload and send Dirty event to client)->Client(DirtyPayload)........
Dead loop !
Dead loop !
Good catch: no, this design is not compatible with 0.x client 😢
It's a mistake because we have missed this problem when we were designing this v1 new logic.
I believe this is also related to
The solution for this problem might be:
- Use a workaround logic in the server, to prevent to emit the same dirty event in a period of time, as a workaround. I think it will be very easy to implement this by using a RxJS throttle operator.
- Make sure the user are upgrading to the Wechaty v1.x before they are using a v1.x puppet service.
Please feel free to use any other workarounds if you can find a better one.
And I think this problem should be posted in the website if the user need to upgrade to wechaty v1.x go use the new paimon.
So i think the GRPC protocol need to keep it simple logic.I suggest that the GRPC server don't send Dirty
event to client when client call dirtyPayload which is request-reponse mode.
So I think the GRPC protocol needs to keep it simple logic. I suggest that the GRPC server doesn't send the Dirty event to the client when the client call dirtyPayload which is request-reponse mode.
If the Wechaty user is using the wechaty-puppet-service
direct connect to the Paimon service, then your suggestion would work.
But unfortunately, we can not follow your suggestion because what you suggested can not cover all the use cases.
When we have any middle layers between the Wechaty and the Puppet Provider, for example:
Wechaty -> Wechaty Puppet Service Client -> gRPC -> Wechaty Puppet Service Server -> Wechatyu Puppet Provider
Then we need a dirty
event to be propagated through the above pipeline, to make sure all the payload cache in the path has been dropped, because the request-response
model can only make sure the first and last node can be synced.
In this case, we need to emit the dirty
payload and propagate it through the pipeline, so that all the layers will be updated correctly.
Wechaty -> Wechaty Puppet Service Client -> gRPC -> Wechaty Puppet Service Server -> Wechatyu Puppet Provider
In the above pipeline,gRPC can await the Wechaty Puppet Service Server
to clear cache, then response(dirtyPayload response) to Wechaty Puppet Service Client
that the payload cache is cleared.
Unfortunately, we can not only await the Wechaty Puppet Service Server
to clear the cache because this logic only clears the cache from one node, which is the server itself.
The Wechaty needs to make sure all the nodes in the pipeline have cleared the cache before it can continue reloading the payload again, to make sure there's no outdated cache in the pipeline so that it can get the newest data correctly.
That's the reason why a dirty
is needed and should be propagated through the pipeline.
Each node in the pipeline will listen to the dirty
event and do two jobs:
- clear the cache
- propagate the
dirty
event to the upper layer
After the dirty
event has been propagated to the latest node (which is the Wechaty layer), then the Wechaty can know that all the outdated cache has been cleared.
I hope I have explained this problem clearly. Please think more about a complex architecture other than the Paimon case, and that will greatly help you to understand this dirty
event design.
In other side,the design maybe trigger wechat risk control if infinite request for the fresh payload(Contact/Room)。
I think I might find the infinite loop problem you talked about in wechaty/puppet-wechat#177. I think it's a puppet issue so I post it here.
Let's take a look at this example and see how it goes.
bot.Contact.find({name: xxx}).phone(['+8612345678'])
Wechaty ContactMixin phone()
await this.wechaty.puppet.contactPhone(this.id, phoneList)
await this.wechaty.puppet.contactPayloadDirty(this.id)
Wechaty Puppet ContactMixin contactPayloadDirty()
await this.__dirtyPayloadAwait(
PayloadType.Contact,
id,
)
Puppet Server will emit a dirty event and resolve the above await. Everything seems alright so far, but here:
Wechaty PuppetMixin __setupPuppetEvents()
puppet.on('dirty', async ({ payloadType, payloadId }) => {
try {
switch (payloadType) {
case PUPPET.types.Payload.RoomMember:
case PUPPET.types.Payload.Contact:
await (await this.Contact.find({ id: payloadId }))?.sync()
break
...
Wechaty Contact sync()
await this.ready(true)
Wechaty Contact ready()
if (forceSync) {
await this.wechaty.puppet.contactPayloadDirty(this.id)
}
Here we go, we go back to Wechaty Puppet ContactMixin contactPayloadDirty()
Good catch!
It seems that the dead loop bug had been located.
Will investigate on it after back to my computer, and please feel free to create a PR to send your fix.
Thank you very much!
There are many ways to fix this, so we should come to some consensus first.
In my opinion, it's a bit odd for puppet-implementation to emit a dirty event when the dirty comes from the puppet-service. This dirty is completely different from the dirty event trigged by puppet-implementation. We can either remove it, or add a new event type for it. (like 'dirty-response').
If to separate these two kind of dirty event is not acceptable, we can also add a pool in onDirty handler, so that all dirty events started from __dirtyPayloadAwait will be dismissed.
Maybe we should move this discuss into a new issue since this one is closed long ago.
@huan is It means that the 'dirty' event depend on Implementation of the puppet?
I use puppet-wxwork and create a room ,when it is created,here is the error,then I add something like
await bot.puppet.roomPayloadDirty(room.id)
await bot.puppet.roomMemberPayloadDirty(room.id)
The error still exists,Is my calling method is not right?
14:06:14 INFO Bot createDingRoom() new room created: WechatifiedRoomImpl
14:06:19 WARN PuppetCacheMixin __dirtyPayloadAwait() timeout.
The `dirty` event should be received but no one found.
Learn more from https://github.com/wechaty/puppet/issues/158
payloadType: Room(3)
payloadId: R:10742655366665696
error: Timeout after 5000 ms
stack: GError: Timeout after 5000 ms
at Function.from (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/gerror/gerror.ts:69:17)
at Timeout._onTimeout (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/timeout-promise/timeout-promise.ts:26:36)
at listOnTimeout (node:internal/timers:557:17)
at processTimers (node:internal/timers:500:7)
14:06:22 WARN PuppetCacheMixin __dirtyPayloadAwait() timeout.
The `dirty` event should be received but no one found.
Learn more from https://github.com/wechaty/puppet/issues/158
payloadType: Room(3)
payloadId: R:10742655366665696
error: Timeout after 5000 ms
stack: GError: Timeout after 5000 ms
at Function.from (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/gerror/gerror.ts:69:17)
at Timeout._onTimeout (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/timeout-promise/timeout-promise.ts:26:36)
at listOnTimeout (node:internal/timers:557:17)
at processTimers (node:internal/timers:500:7)
14:06:27 WARN PuppetCacheMixin __dirtyPayloadAwait() timeout.
The `dirty` event should be received but no one found.
Learn more from https://github.com/wechaty/puppet/issues/158
payloadType: RoomMember(4)
payloadId: R:10742655366665696
error: Timeout after 5000 ms
stack: GError: Timeout after 5000 ms
at Function.from (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/gerror/gerror.ts:69:17)
at Timeout._onTimeout (file:///Users/choogoo/WeChatProjects/groupadmin/bot/node_modules/gerror/src/timeout-promise/timeout-promise.ts:26:36)
at listOnTimeout (node:internal/timers:557:17)
at processTimers (node:internal/timers:500:7)
14:06:27 INFO Room room-7881302781913704 topic changed from room-7881302781913704 to room-7881302781913704 by 瓦力
Yes, it depends on the implementation.
The new Wechaty v1 will expect a dirty event in this case.
BTW: I think when using Wechaty, we should avoid calling the puppet directly as possible as we can.