Manifest file for disk-based WAL implementation
dracoooooo opened this issue · 3 comments
Describe This Problem
To implement a WAL on the local disk, in addition to the segment files that record the logs, another file is needed to record the WAL's metadata. This is because the current WAL Delete interface takes a tableId as a parameter, while logs for multiple tables are recorded in the same segment file, so it is not possible to simply mark all logs before a certain sequence number as deletable. Therefore, a manifest file is needed to maintain this information.
Proposal
Format
Using protobuf as the file format for WAL manifest:
syntax = "proto3";

message Manifest {
  map<string, uint64> latest_mark_deleted = 1;
}
The key in the map is <regionId>:<tableId>, and the value is the highest sequence number marked as deleted for this table in the WAL.
The manifest deliberately records nothing else, so that it never needs to be updated while appending logs, which reduces I/O overhead.
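For illustration only, a minimal in-memory counterpart of this manifest might look like the following Rust sketch (the struct and method names are hypothetical, not taken from any actual codebase):

use std::collections::HashMap;

// In-memory view of the on-disk manifest (hypothetical names).
// Key: "<regionId>:<tableId>"; value: highest sequence number already
// marked as deleted for that table.
#[derive(Default)]
struct Manifest {
    latest_mark_deleted: HashMap<String, u64>,
}

impl Manifest {
    // Record a delete mark; the mark only ever moves forward.
    fn mark_delete(&mut self, region_id: u64, table_id: u64, seq: u64) {
        let key = format!("{region_id}:{table_id}");
        let entry = self.latest_mark_deleted.entry(key).or_insert(0);
        *entry = (*entry).max(seq);
    }
}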
Append Logs
Do not update the manifest file.
Read Logs
Use the manifest file to skip logs that have already been deleted.
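A rough sketch of this filtering step, assuming the manifest map has already been loaded into memory (the function name is made up):

// Returns true if a log entry can be skipped because it was already
// marked as deleted in the manifest map.
fn is_deleted(latest_mark_deleted: &std::collections::HashMap<String, u64>,
              table_key: &str, seq: u64) -> bool {
    latest_mark_deleted
        .get(table_key)
        .map_or(false, |&mark| seq <= mark)
}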
Delete Logs
Update the values in the map, create a new manifest file, and overwrite the old file.
Record the maximum sequence number of every table in each segment file in memory. When every table's marked-deleted sequence number is greater than the segment's maximum, delete this segment (see the sketch below).
When an old segment is deleted, if a table's logs exist only in that old segment, remove the table from the manifest's map.
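A sketch of the segment-deletion check described above, assuming a hypothetical SegmentMeta struct holding the per-table maximum sequence numbers kept in memory:

use std::collections::HashMap;

// Hypothetical per-segment metadata kept in memory.
struct SegmentMeta {
    // table key -> maximum sequence number this segment holds for that table
    max_seq_per_table: HashMap<String, u64>,
}

// Following the rule above: the segment can be deleted once every table it
// contains has a delete mark greater than the segment's maximum for that
// table (>= would also suffice, since the mark itself is already deleted).
fn can_delete_segment(seg: &SegmentMeta, latest_mark_deleted: &HashMap<String, u64>) -> bool {
    seg.max_seq_per_table.iter().all(|(table, &max_seq)| {
        latest_mark_deleted
            .get(table)
            .map_or(false, |&mark| mark > max_seq)
    })
}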
Potential Risks
- If the number of tables is very large, the overhead of overwriting this manifest file each time could be significant.
Additional Context
No response
The key in the map is <regionId>:<tableId>,

I think we can encode the regionId in the WAL directory path, so the key only needs to contain the tableId.
Use the manifest file to skip logs that have already been deleted.
How will you skip WAL files? Which strategy will you use?
Update the values in the map, create a new manifest file, and overwrite the old file.
This is the normal case. What if there is a partial error, such as the overwrite failing? You need to document more details; pseudo code or a sequence diagram may help.
Record the maximum sequence number of all tables in each segment file in memory
How will you recover this info when the server starts up? Do we need to iterate over the whole WAL files?
The key in the map is <regionId>:<tableId>,
I think we can encode the regionId in the WAL directory path, so the key only needs to contain the tableId.
Indeed.
Use the manifest file to skip logs that have already been deleted.
How will you skip WAL files? Which strategy will you use?
This manifest exists both in the file system and in memory; in memory it is represented as a map. Since we also record the min and max sequence numbers of each table per segment in memory, we can skip the segments that are not needed. While iterating through the remaining segments, we might still encounter logs that have already been deleted; these can be skipped based on the information in the map.
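A sketch of that segment-level skip, assuming hypothetical in-memory range info per segment and table:

use std::collections::HashMap;

// Hypothetical per-table sequence range recorded for each segment.
// min_seq could similarly be used to skip segments entirely above or
// below a requested read range.
struct SeqRange { min_seq: u64, max_seq: u64 }
struct SegmentMeta { ranges: HashMap<String, SeqRange> }

// A segment only needs to be scanned for a table if it still holds logs
// beyond that table's marked-deleted sequence number.
fn segment_needed(seg: &SegmentMeta, table_key: &str, deleted_mark: u64) -> bool {
    seg.ranges
        .get(table_key)
        .map_or(false, |r| r.max_seq > deleted_mark)
}
// Entries inside a scanned segment with seq <= deleted_mark are still
// skipped individually, as in the earlier read-path check.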
Update the values in the map, create a new manifest file, and overwrite the old file.
This is the normal case. What if there is a partial error, such as the overwrite failing? You need to document more details; pseudo code or a sequence diagram may help.
The general steps for overwriting are to acquire the write lock for the manifest, create a new temporary file, write to this temporary file, use fsync to ensure the content has been written to disk, and then use rename to replace the original file.
If an error occurs in the steps above, I don’t think it can be handled, and we would have to panic.
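A minimal sketch of those steps in Rust, assuming the caller already holds the manifest write lock (the helper name and error handling are illustrative):

use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Write a new manifest atomically: temporary file -> write -> fsync -> rename.
fn overwrite_manifest(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?;          // fsync: ensure the content reaches the disk
    fs::rename(&tmp, path)?;   // atomically replace the old manifest
    Ok(())
}
// Per the discussion, any error returned here would be treated as fatal (panic).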
Record the maximum sequence number of all tables in each segment file in memory
How will you recover this info when the server starts up? Do we need to iterate over the whole WAL files?
Yes. I think this is a trade-off to avoid writing the manifest file during the WAL write operation.
After some discussion, the manifest isn't a must and would introduce extra burden.
The idea is that if we can delete unused segments in time, then when the server restarts, we can reconstruct each table's sequence numbers from the segments one by one.
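A sketch of that startup reconstruction, assuming a hypothetical iterator that yields decoded (table key, sequence) records while scanning a segment:

use std::collections::HashMap;

// Hypothetical decoded record header from a segment scan.
struct Record { table_key: String, seq: u64 }

// Rebuild the per-table maximum sequence numbers for one segment by
// scanning its records once at startup.
fn rebuild_table_seqs(records: impl Iterator<Item = Record>) -> HashMap<String, u64> {
    let mut max_seq_per_table = HashMap::new();
    for rec in records {
        let entry = max_seq_per_table.entry(rec.table_key).or_insert(0);
        *entry = (*entry).max(rec.seq);
    }
    max_seq_per_table
}
// Repeating this segment by segment recreates the table sequence info
// without any manifest file.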