Staged writes can end up in invalid state
vasil-pashov opened this issue · 0 comments
Describe the bug
Currently the flow for compact_incomplete/sort_merge is:
- Read APPEND_DATA keys
- Compact the result into TABLE_DATA
- Create INDEX_DATA
- Remove all APPEND_DATA keys
- Create new version key
- Update ref key
If the program dies right after the APPEND_DATA keys are deleted but before the version key is created, the symbol is left in an incomplete, unreadable state. It won't appear in the list of incomplete symbols (because there are no APPEND_DATA keys left), and it won't be in the symbol list either, but there will be orphaned TABLE_DATA and INDEX_DATA keys.
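To make the failure window concrete, here is a minimal sketch of the current ordering against a toy in-memory store. It is not ArcticDB code: `new_store`, `compact_current_order`, `is_incomplete`, `is_listed` and the `Crash` exception are hypothetical names, and treating "has a version key" as "appears in the symbol list" is a simplifying assumption.

```python
# Toy in-memory model of the key types named in this issue; not ArcticDB code.
class Crash(Exception):
    """Simulates the process dying right after a storage operation."""


def new_store(symbol):
    # One staged segment exists for the symbol; nothing is committed yet.
    return {
        "APPEND_DATA": {symbol},
        "TABLE_DATA": set(),
        "INDEX_DATA": set(),
        "VERSION": set(),
        "REF": set(),
    }


def compact_current_order(store, symbol, crash_after=None):
    # Same order as the flow listed above (reading APPEND_DATA is omitted
    # because it does not change the store).
    steps = [
        ("write_table_data", lambda: store["TABLE_DATA"].add(symbol)),
        ("write_index_data", lambda: store["INDEX_DATA"].add(symbol)),
        ("delete_append_data", lambda: store["APPEND_DATA"].discard(symbol)),
        ("write_version_key", lambda: store["VERSION"].add(symbol)),
        ("update_ref_key", lambda: store["REF"].add(symbol)),
    ]
    for name, op in steps:
        op()
        if name == crash_after:
            raise Crash(name)


def is_incomplete(store, symbol):
    # Assumption: a symbol is "incomplete" while it still has staged keys.
    return symbol in store["APPEND_DATA"]


def is_listed(store, symbol):
    # Assumption: a symbol is visible once it has a version key.
    return symbol in store["VERSION"]


store = new_store("sym")
try:
    compact_current_order(store, "sym", crash_after="delete_append_data")
except Crash:
    pass

# Neither view knows about the symbol, yet data and index keys remain.
print(is_incomplete(store, "sym"), is_listed(store, "sym"))  # False False
print(store["TABLE_DATA"], store["INDEX_DATA"])  # {'sym'} {'sym'}
```

In this model, a crash after any other step leaves the symbol visible in at least one of the two views; only the gap between deleting the staged keys and writing the version key hides it from both.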
Steps/Code to Reproduce
Start compaction and kill the process right after sort_merge_impl or compact_incomplete_impl.
Expected Results
The symbol should be either in the list of incomplete symbols or in the symbol list.
To achieve this, the sequence of operations must be changed so that the APPEND_DATA keys are deleted only after the ref key is updated.
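A minimal sketch of the proposed ordering, again against a toy in-memory store rather than the real storage layer. All names here (`PROPOSED_ORDER`, `compact_proposed_order`, `state_after_crash`, `Crash`) are hypothetical, and "has a version key" stands in for "appears in the symbol list". The loop at the end crashes after every step in turn and checks that the symbol is always visible in at least one of the two views.

```python
# Toy in-memory model of the proposed ordering; not ArcticDB code.
class Crash(Exception):
    """Simulates the process dying right after a storage operation."""


# Proposed order: staged keys are removed last, after the ref key update.
PROPOSED_ORDER = [
    "write_table_data",
    "write_index_data",
    "write_version_key",
    "update_ref_key",
    "delete_append_data",
]


def compact_proposed_order(store, symbol, crash_after=None):
    ops = {
        "write_table_data": lambda: store["TABLE_DATA"].add(symbol),
        "write_index_data": lambda: store["INDEX_DATA"].add(symbol),
        "write_version_key": lambda: store["VERSION"].add(symbol),
        "update_ref_key": lambda: store["REF"].add(symbol),
        "delete_append_data": lambda: store["APPEND_DATA"].discard(symbol),
    }
    for name in PROPOSED_ORDER:
        ops[name]()
        if name == crash_after:
            raise Crash(name)


def state_after_crash(crash_after, symbol="sym"):
    """Return (incomplete, listed) as seen after a crash at the given step."""
    store = {
        "APPEND_DATA": {symbol},
        "TABLE_DATA": set(),
        "INDEX_DATA": set(),
        "VERSION": set(),
        "REF": set(),
    }
    try:
        compact_proposed_order(store, symbol, crash_after=crash_after)
    except Crash:
        pass
    return symbol in store["APPEND_DATA"], symbol in store["VERSION"]


for step in PROPOSED_ORDER:
    incomplete, listed = state_after_crash(step)
    # The symbol is never hidden from both views at once.
    assert incomplete or listed, step
    print(f"{step}: incomplete={incomplete} listed={listed}")
```

One consequence visible in this model: a crash after the version key is written but before the staged keys are deleted leaves the symbol both listed and incomplete, so a retry or cleanup path would need to tolerate leftover APPEND_DATA keys.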
OS, Python Version and ArcticDB Version
all
Backend storage used
No response
Additional Context
No response