Partition removal (TTL support)
danthegoodman1 opened this issue · 0 comments
While TTLs can kind of work with custom merge queries and filtering in end queries (like how ClickHouse works), it's not an ideal DX.
We can handle TTLs (and other forms of partition cleanup, e.g. deleting a user ID if it appears in a partition) with a generic, user-defined partition_cleaner function that is given a list of unique partitions and returns the list of partitions to remove. We will then delete all data files in those partitions (only those from the initial list, so anything added in the meantime is left alone) and write a new log file with tombstones for those data parts (and for any log files that contained those data parts).
This shares the same lock as merging.
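As a rough illustration of what a user-defined partition_cleaner could look like for the TTL case, here is a minimal sketch. It assumes partitions are strings of the form `d=YYYY-MM-DD`; that format, the `make_ttl_cleaner` helper, and the exact signature are hypothetical and not part of the proposal.

```python
from datetime import date, timedelta
from typing import Callable, List


def make_ttl_cleaner(today: date, max_age_days: int) -> Callable[[List[str]], List[str]]:
    """Build a partition_cleaner that expires partitions older than max_age_days.

    Assumption: each partition is a string like "d=2023-01-15".
    """
    cutoff = today - timedelta(days=max_age_days)

    def partition_cleaner(partitions: List[str]) -> List[str]:
        expired = []
        for partition in partitions:
            # Parse the date out of the hypothetical "d=YYYY-MM-DD" layout.
            day = date.fromisoformat(partition.split("=", 1)[1])
            if day < cutoff:
                expired.append(partition)
        return expired

    return partition_cleaner
```

The same shape would cover other cleanup rules (for example, returning every partition that contains a given user ID); only the predicate inside the loop changes.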
Basic implementation is:
- Read the state and build a list of unique partitions
- Pass that list to the user's function and get back the list of partitions to delete
- "Merge" the affected log files by creating a new one in which every file marker in the removed partitions carries a tombstone, along with tombstones for the log files involved
Tombstone cleanup will then delete them later. This is purely a log merge; no data parts are touched.
We then need to write docs on how this works in both the README and the architecture doc.
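The steps above can be sketched roughly as follows. This is an in-memory model only: `FileMarker`, `remove_partitions`, the dict-of-logs state, and the millisecond tombstone timestamp are all hypothetical stand-ins, not the project's actual types, and locking and storage I/O are left out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FileMarker:
    path: str
    partition: str
    tombstone: Optional[int] = None  # hypothetical: ms timestamp when tombstoned


def remove_partitions(
    logs: Dict[str, List[FileMarker]],
    partition_cleaner: Callable[[List[str]], List[str]],
    now_ms: int,
) -> Tuple[List[FileMarker], List[str]]:
    """Return (markers for the new log file, log files to tombstone)."""
    # Step 1: read the state and build the list of unique live partitions.
    partitions = sorted(
        {m.partition for markers in logs.values() for m in markers if m.tombstone is None}
    )

    # Step 2: ask the user which partitions to delete. Intersecting with the
    # initial list ignores anything that was not in the snapshot.
    to_remove = set(partition_cleaner(partitions)) & set(partitions)

    def is_removed(m: FileMarker) -> bool:
        return m.tombstone is None and m.partition in to_remove

    # Step 3: "merge" the affected log files into one new log. No data parts
    # are touched here; tombstone cleanup deletes them later.
    merged: List[FileMarker] = []
    log_tombstones: List[str] = []
    for log_path, markers in logs.items():
        if not any(is_removed(m) for m in markers):
            continue  # this log file holds nothing in a removed partition
        log_tombstones.append(log_path)
        for m in markers:
            # Markers in kept partitions are carried over unchanged.
            merged.append(replace(m, tombstone=now_ms) if is_removed(m) else m)
    return merged, log_tombstones
```

Note that markers for kept partitions in an affected log file are carried into the new log, since the old log file itself is tombstoned.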