RBMHTechnology/eventuate

Using EventsourcedProcessors in combination with replication may lead to loss of processed events

volkerstampa opened this issue · 1 comments

An EventsourcedProcessor is implemented on top of the replication protocol, so it applies the causality filter to processed events. However, the filter is evaluated against the vector timestamp of the source event, not against the final vector timestamp the processed event should get. When both the source and the target log of the processor are replicated, this may lead to losing (i.e. filtering out) the processed event.

For example:
Given two locations A and B with two replicated logs L1 and L2. Each location runs a processor that processes only locally emitted events from L1 to L2 (to avoid processing events twice). Vector timestamps have four process ids, A_L1, A_L2, B_L1 and B_L2, and are written in that order in the form (1,2,3,4).

  • A emits e1 (1,0,0,0) to L1
  • B replicates e1 to L1
  • B emits e2 (1,0,2,0) to L1
  • B processes e2 to L2: e3 (1,0,2,1)
  • A replicates e3 to L2
  • A processes e1 to L2: e4 (1,2,0,0)

The problem is that e4 is initially created with the timestamp of its source event, (1,0,0,0). It is then checked by the causality filter against the version vector of the target log L2, which is (1,0,2,1) in this example, and is filtered out before its timestamp is finalized to (1,2,0,0), which would not have been filtered out.
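The comparison can be reproduced with a small model. This is only a sketch, not Eventuate code: it assumes the causality filter drops an event when its vector timestamp is component-wise less than or equal to the version vector of the target log, and the names `dominated_by` and `passes_causality_filter` are made up for illustration.

```python
from typing import Tuple

# Component order as in the example above: (A_L1, A_L2, B_L1, B_L2)
VectorTime = Tuple[int, int, int, int]


def dominated_by(timestamp: VectorTime, version_vector: VectorTime) -> bool:
    """True if every component of timestamp is <= the matching component."""
    return all(t <= v for t, v in zip(timestamp, version_vector))


def passes_causality_filter(timestamp: VectorTime, version_vector: VectorTime) -> bool:
    """Assumed filter rule: keep only events not already covered by the log."""
    return not dominated_by(timestamp, version_vector)


# Version vector of L2 at location A after e3 has been replicated
target_version: VectorTime = (1, 0, 2, 1)

source_timestamp: VectorTime = (1, 0, 0, 0)  # timestamp e4 inherits from e1
final_timestamp: VectorTime = (1, 2, 0, 0)   # timestamp e4 should end up with

print(passes_causality_filter(source_timestamp, target_version))  # False: e4 is dropped
print(passes_causality_filter(final_timestamp, target_version))   # True: e4 would be kept
```

Under this assumption the filter's decision depends only on which of the two timestamps it is given, which is the mismatch described above.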

At the moment, the described scenario (two processors in two locations processing from and to the same replicated log) is not supported.

The causality filter for processors makes a lot of sense to ensure idempotence even if the processor's progress cannot be written with the emitted events (due to an event-log failure for example).
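To illustrate the idempotence argument, here is a minimal sketch of a target log that rejects re-emitted events. It is a simplified model, not Eventuate's implementation: `TargetLog` and `write_processed` are hypothetical names, and the model merges the source timestamp into the version vector directly instead of finalizing the timestamp with the target log's own process id.

```python
from typing import List, Tuple

VectorTime = Tuple[int, ...]


class TargetLog:
    """Hypothetical stand-in for a processor's target log."""

    def __init__(self, size: int) -> None:
        self.version_vector: VectorTime = (0,) * size
        self.events: List[str] = []

    def write_processed(self, payload: str, source_timestamp: VectorTime) -> bool:
        """Write the event unless the log's version vector already covers it."""
        covered = all(
            t <= v for t, v in zip(source_timestamp, self.version_vector)
        )
        if covered:
            return False  # dropped by the (assumed) causality filter
        self.events.append(payload)
        # Merge: component-wise maximum of the two vectors
        self.version_vector = tuple(
            max(t, v) for t, v in zip(source_timestamp, self.version_vector)
        )
        return True


log = TargetLog(size=4)

# First attempt: the event is written, but assume the processor's
# progress could not be stored because of an event-log failure.
first = log.write_processed("processed(e1)", (1, 0, 0, 0))

# After recovery the processor replays e1 and emits the same event again.
retry = log.write_processed("processed(e1)", (1, 0, 0, 0))

print(first, retry, log.events)
```

In this model the retry is rejected because the version vector already covers the source timestamp, so the target log ends up with a single copy of the processed event.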