siddhi-io/distribution

Events out of order during distributed deployment recovery

suhothayan opened this issue · 2 comments

Description:
Events arrive out of order during recovery of a distributed deployment (while replaying data from the NATS Streaming Server).

It would be good to identify the root cause, and to fix it provided the fix does not introduce performance issues.

2020-01-03 15:30:29 INFO  LoggerService:42 - {event={name=Cake, amount=380.0}}
2020-01-03 15:30:30 INFO  LoggerService:42 - {event={name=Cake, amount=400.0}}
2020-01-03 15:30:31 INFO  LoggerService:42 - {event={name=Cake, amount=420.0}}
2020-01-03 15:30:31 INFO  LoggerService:42 - {event={name=Cake, amount=440.0}}
2020-01-03 15:30:45 INFO  LoggerService:42 - {event={name=Cake, amount=460.0}}
2020-01-03 15:30:46 INFO  LoggerService:42 - {event={name=Cake, amount=480.0}}
2020-01-03 15:30:48 INFO  LoggerService:42 - {event={name=Cake, amount=500.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=380.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=400.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=440.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=480.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=420.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=500.0}}
2020-01-03 15:30:55 INFO  LoggerService:42 - {event={name=Cake, amount=460.0}}
2020-01-03 15:31:51 INFO  LoggerService:42 - {event={name=Cake, amount=520.0}}
2020-01-03 15:32:09 INFO  LoggerService:42 - {event={name=Cake, amount=540.0}}
2020-01-03 15:32:10 INFO  LoggerService:42 - {event={name=Cake, amount=560.0}}
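
For context, a minimal Siddhi app of the kind that could produce the log above might look like the sketch below. The app name, stream names, and NATS connection parameters are assumptions for illustration, not the actual deployment configuration:

```sql
-- Minimal sketch; names and connection parameters are assumed, not the real deployment config
@App:name('SweetProductionApp')

@source(type = 'nats', destination = 'SweetProductionStream',
        bootstrap.servers = 'nats://localhost:4222',
        cluster.id = 'siddhi-nats-cluster',
        @map(type = 'json'))
define stream SweetProductionStream (name string, amount double);

-- The log sink prints events via LoggerService, as in the output above
@sink(type = 'log')
define stream ProductionLogStream (name string, amount double);

from SweetProductionStream
select name, amount
insert into ProductionLogStream;
```

During recovery, the events replayed from the NATS Streaming Server between 15:30:55 entries above arrive in a different order (380, 400, 440, 480, 420, 500, 460) than they were originally received.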

I tried to reproduce the scenario with a test case, as in siddhi-io/siddhi-io-nats#36, but the events were retrieved in the correct order at the source. While doing so, I did encounter a bug where an event was duplicated during persist and restore; that was fixed in the above PR itself.

I also tried the same by publishing through a NATS sink instead of the NatsClient, but could not reproduce the out-of-order scenario.

I will try this on a distributed deployment and update the thread.