AutoMQ/automq

[Enhancement] WriteAheadLog with sequential callbacks

superhx opened this issue · 4 comments

Problem

Currently, BlockWALService persists data blocks in parallel and reports success to the upper layer as soon as a given data block is persisted, even if earlier data blocks have not yet finished persisting. Although this method provides "better" write latency, it shifts the responsibility of ensuring sequentiality to the upper layer, making the upper layer logic more complex.

Expectation

Implement a SequentialBlockWALService:

  • Provide sequential callback semantics (the underlying writes can still be concurrent);
  • Through optimization of locking and the threading model, throughput and latency should match (or even beat) those of the original BlockWALService on Aliyun ESSD PL1 20GB 120MiB/s 2800 IOPS with write concurrency 8/16/32/64;
  • The WAL format must remain the same as before, and the recovery logic of BlockWALService, which tolerates data holes, must still be supported, so that the new service can directly replace the old one in the future.
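
A minimal sketch of the sequential callback semantics listed above, assuming that each append is registered in submission order and that I/O threads report durability in any order. The names SequentialCallbackSketch, Pending, submit and onPersisted are hypothetical and are not part of BlockWALService; the sketch only illustrates "concurrent writes, ordered acks" and says nothing about how the real service should handle locking or performance.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper: acks appends strictly in submission order. */
class SequentialCallbackSketch {
    /** One in-flight append; `persisted` flips once its block is durable. */
    static final class Pending {
        final long offset;
        final CompletableFuture<Long> future = new CompletableFuture<>();
        boolean persisted;

        Pending(long offset) {
            this.offset = offset;
        }
    }

    // Appends in submission order; guarded by `this`.
    private final Queue<Pending> inflight = new ArrayDeque<>();

    /** Register an append. Callers must invoke this in the intended order. */
    synchronized Pending submit(long offset) {
        Pending p = new Pending(offset);
        inflight.add(p);
        return p;
    }

    /** Called by any I/O thread, in any order, once its block is durable. */
    synchronized void onPersisted(Pending p) {
        p.persisted = true;
        // Release only the contiguous persisted prefix. Futures are completed
        // while holding the lock so that two I/O threads cannot interleave
        // their acks; a real implementation would likely want something cheaper.
        while (!inflight.isEmpty() && inflight.peek().persisted) {
            Pending head = inflight.poll();
            head.future.complete(head.offset);
        }
    }
}
```

With three appends submitted as offsets 0, 1, 2 and persisted in the order 2, 0, 1, nothing is acked after the first completion, offset 0 is acked after the second, and offsets 1 and 2 are acked together after the third.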

@superhx Hey, after carefully considering the current write model, I've found that to keep the current parallel write model while implementing sequential callback semantics on top of it, it might be more appropriate to handle the ordering at the stage after the block has been written.

After the block is written to the WAL, consider serializing the process of writing the record to the logCache to reduce the complexity at the upper layer.

  1. Create a single-threaded callbackExecutor.
  2. When the I/O thread completes the write, trigger the Request's callback by handing handleAppendCallback over to the callbackExecutor, which then performs the callbacks sequentially.
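
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: every request is assumed to carry a contiguous sequence number starting at 0, and CallbackExecutorSketch, onBlockWritten and shutdownAndWait are hypothetical names. Because all sequencing state is touched only by the single callback thread, no lock is needed around it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: I/O threads hand callbacks to one ordering thread. */
class CallbackExecutorSketch {
    private final ExecutorService callbackExecutor = Executors.newSingleThreadExecutor();

    // Both fields are accessed only from the callbackExecutor thread.
    private final Map<Long, Runnable> completed = new HashMap<>();
    private long nextSeq = 0;

    /** Called from any I/O thread once the block with number `seq` is durable. */
    void onBlockWritten(long seq, Runnable handleAppendCallback) {
        callbackExecutor.execute(() -> {
            completed.put(seq, handleAppendCallback);
            // Run every callback whose predecessors have all completed.
            Runnable next;
            while ((next = completed.remove(nextSeq)) != null) {
                next.run();
                nextSeq++;
            }
        });
    }

    /** Stops accepting work and waits for queued callbacks; false on timeout. */
    boolean shutdownAndWait() {
        callbackExecutor.shutdown();
        try {
            return callbackExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that a single-threaded executor on its own only serializes callbacks in completion order; the small reordering buffer is what turns that into submission order.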

For example, see com.automq.stream.s3.S3Storage#append0.

This approach avoids concurrency control in the callbackSequencer and also decouples business-related operations from I/O operations. At present, the I/O thread is responsible not only for writing the block but also for running the request's callback processing chain.

Potential issues: under a large number of requests, callback processing may put significant pressure on the single thread, although for now the callbacks are all pure in-memory operations. If delayed ack responses are a concern, the callbackExecutor could be designed as a multi-threaded model that routes each callback to a thread in the pool based on streamId, thereby preserving ordering within each stream.
Alternatively, consider setting up a dedicated ack response thread pool that only handles sending acks to the upper layer.
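
A rough sketch of the streamId-based routing idea, assuming a fixed pool of single-threaded executors and a simple modulo mapping. StreamAffinityExecutorSketch and its methods are hypothetical names. The sketch preserves per-stream order only if callbacks for a given stream are submitted in order in the first place.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: callbacks of one stream always run on the same thread. */
class StreamAffinityExecutorSketch {
    private final ExecutorService[] workers;

    StreamAffinityExecutorSketch(int threads) {
        workers = new ExecutorService[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Maps a streamId to a worker index; floorMod keeps negative ids in range. */
    int indexFor(long streamId) {
        return (int) Math.floorMod(streamId, (long) workers.length);
    }

    /** Runs the callback on the worker that owns this stream. */
    void execute(long streamId, Runnable callback) {
        workers[indexFor(streamId)].execute(callback);
    }

    /** Stops all workers and waits for queued callbacks; false on timeout. */
    boolean shutdownAndWait() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                if (!w.awaitTermination(5, TimeUnit.SECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Different streams can progress in parallel, while two callbacks for the same stream never overtake each other because they share one FIFO queue.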

@CLFutureX Hi, I think you might have misunderstood @superhx's idea. In his description,

it shifts the responsibility of ensuring sequentiality to the upper layer, making the upper layer logic more complex

the "upper layer" mentioned here refers to S3Storage rather than the caller of S3Storage.

Specifically, because the current callbacks are not sequential, we have had to add a lot of complex logic to S3Storage, such as WALCallbackSequencer and WALConfirmOffsetCalculator. This has made the S3Storage code overly complex and difficult to maintain.

What we want to do now is to thoroughly refactor BlockWALService so that it invokes callbacks sequentially, thereby reducing the complexity of S3Storage.

Yes, this will be a big project.