paritytech/polkadot-sdk

[subsystem] improve parallelization by consuming messages inbetween overseer signals in parallel

Opened this issue · 4 comments

The futures API we could consider to use more in order to avoid the stalling subsystems:

https://docs.rs/futures/0.3.17/futures/prelude/stream/trait.StreamExt.html#method.buffer_unordered

Currently we use a bunch of for msg in source.await() { match msg {} } style patterns which is rather prohibitive of any parallelization.
There are a few perils of Signals vs regular Messages (again) that need to be taken into account. The sequencing within the overseer is not sufficient. This might require some additional logic around BufferUnordered or a custom impl that is Signal/Message-semantic aware.

This will have two effects:

  • individual messages can no longer block the whole execution flow
  • easy tuning knobs for parallelization per subsystem
  • maintain per subsystem backpressure

There are certain instances where we actually want specific messages to block execution flow. A prime example is handling overseer signals. For instance, paritytech/polkadot#3924 only works because we ensure that the dispute coordinator handles collecting backing votes and candidates before we handle a message from a corresponding approval task that issues an IssueLocalStatement message. The overseer model is designed for synchronization around the advancement of the relay chain, and we should take care to preserve that property where needed.

As a design choice, I would advocate for non-parallel as a default and we should be explicit about which work we do in the background. There are likely diminishing returns for processing more and more messages in parallel, and it may even reduce performance at some point due to increased reliance on shared state.

The idea is not so much about the fact that compute power is limited or speeding things up, but rather being bottlenecked on a single subsystem that does some (relatively speaking) heavier computations and hence become a bottleneck when doing serial processing as we currently do. The above approach is particularly interesting for those.

I fully agree to stating things explicitly.

If we can guarantee on synchronization on Overseer::Signal, keeping the total ordering (rather use buffer() than buffer_unordered(), except for the heavy processing part, this has little chance of failure. This would imply splitting subsystem state updates into pre and post stage of doing the heavy and/or blocking processing part which indeed would require some additional investigations.

For a few individual subsystems this could play out very well, for others, less so.

Candidates to consider are those subsystems that are:

  • blocking on an oneshot::Receiver which corresponding oneshot::Sender was sent to another subsystem
  • dealing with erasure code recovery or signature verification operations

Re-prioritizing in face of paritytech/polkadot#4911