Adapt replication batch size dynamically
volkerstampa opened this issue · 1 comments
Currently the number of events that are replicated through a single Akka message is fixed (eventuate.log.write-batch-size), provided enough events are waiting for replication. If the binary representation of the resulting message exceeds the akka-remote frame size (akka.remote.netty.tcp.maximum-frame-size), the message is dropped at the source and replication essentially stops, as each retry results in the same error. A problem like this can only be fixed by reconfiguring either eventuate.log.write-batch-size or akka.remote.netty.tcp.maximum-frame-size and restarting the application (or at least the actor system).
To avoid a restart in these situations, Eventuate could adapt the replication batch size dynamically. For example, if n (n >= 1) replication requests did not receive a response, the batch size is reduced incrementally by a factor f1 (f1 < 1) until a replication response is received. After that, the batch size can be increased again, either incrementally (by a factor f2 > 1) or at once to its original size.
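To make the proposal concrete, here is a minimal sketch of the shrink/grow bookkeeping as an immutable state value. All names (AdaptiveBatchSize, onNoResponse, onResponse) are hypothetical and not part of Eventuate. The defaults for n, f1 and f2 are arbitrary assumptions, and "no response" is assumed to be detected by some timeout that is not shown here.

```scala
// Hypothetical sketch: none of these names exist in Eventuate.
final case class AdaptiveBatchSize(
    max: Int,           // configured upper bound (e.g. eventuate.log.write-batch-size)
    current: Int,       // batch size to use for the next replication request
    failures: Int = 0,  // consecutive requests that received no response
    n: Int = 1,         // unanswered requests tolerated before shrinking (n >= 1)
    f1: Double = 0.5,   // shrink factor, 0 < f1 < 1 (assumed default)
    f2: Double = 2.0    // growth factor, f2 > 1 (assumed default)
) {
  require(max >= 1 && current >= 1 && current <= max, "need 1 <= current <= max")
  require(n >= 1 && f1 > 0 && f1 < 1 && f2 > 1, "need n >= 1, 0 < f1 < 1, f2 > 1")

  /** A replication request got no response: shrink after n consecutive misses. */
  def onNoResponse: AdaptiveBatchSize =
    if (failures + 1 >= n)
      copy(current = math.max(1, (current * f1).toInt), failures = 0)
    else
      copy(failures = failures + 1)

  /** A replication response arrived: grow back towards the configured size. */
  def onResponse: AdaptiveBatchSize =
    copy(
      // grow by f2 but by at least one event, never beyond max
      current = math.min(max, math.max(current + 1, (current * f2).toInt)),
      failures = 0
    )
}
```

With max = 64 and the defaults above, two unanswered requests would shrink the batch from 64 to 32 to 16, and two subsequent responses would bring it back to 64. Restoring the original size "at once" would instead amount to copy(current = max) in onResponse.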
Alternatively, the batch size could be measured in bytes rather than in number of events.
Some thoughts: an implementation that can determine the total number of bytes on the sender side would probably result in the most efficient replication algorithm (without changing too much).
The receiver would then probably have to request either
- "at-most-n" events (allowing the receiver to apply some primitive form of back-pressure), or
- "as many as fit in the batch" events (to achieve a large throughput)
The feasibility of the second option probably depends on:
- whether it is possible to determine an upper bound for the protocol overhead in which the actual batch is embedded
- whether we can determine the serialized size of the event batch. As far as I can recall, the serialization is done by Akka, so it is not clear to me whether this is possible.
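Assuming both questions can be answered, the sender-side selection could look roughly like the sketch below. It is only an illustration: takeBatch, sizeOf and overheadBytes are hypothetical names, and sizeOf stands in for whatever mechanism would yield the serialized size of an event, which is exactly the open point above.

```scala
// Hypothetical sketch: `takeBatch`, `sizeOf` and `overheadBytes` are illustrative names only.
object ByteBoundedBatch {
  /**
   * Returns the longest prefix of `events` whose summed serialized size plus
   * `overheadBytes` stays within `maxFrameBytes`, capped at `atMost` events.
   *
   * @param atMost        receiver-requested upper bound on the number of events
   * @param maxFrameBytes frame size limit (e.g. akka.remote.netty.tcp.maximum-frame-size)
   * @param overheadBytes assumed upper bound for the protocol overhead around the batch
   * @param sizeOf        assumed way to obtain the serialized size of one event
   */
  def takeBatch[A](events: Seq[A], atMost: Int, maxFrameBytes: Long, overheadBytes: Long)(
      sizeOf: A => Long): Seq[A] = {
    val budget = maxFrameBytes - overheadBytes
    // Running total of the serialized size after each candidate event.
    val cumulative = events.iterator
      .take(atMost)
      .scanLeft(0L)((total, event) => total + sizeOf(event))
      .drop(1) // drop the initial 0 that scanLeft emits
    // Count how many leading events still fit into the byte budget.
    val fitting = cumulative.takeWhile(_ <= budget).size
    events.take(fitting)
  }
}
```

Passing a large atMost corresponds to the "as many as fit in the batch" request, while a small atMost gives the receiver its back-pressure knob. Note that the result is empty if the first event alone exceeds the budget. Such an event could not be replicated at any batch size and would need separate handling.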
I'd be happy to help in both discussion and implementation ;-)