basho/bitcask

Folds miss values due to race with deletes

Closed this issue · 5 comments

Folds work over a snapshot of the data. When a folder starts, it fetches the current epoch number, which will be used to resolve which version of an object it is allowed to use. This multiple version mechanism was introduced to allow multiple folders to work over the same data. Before this, only one snapshot of the data was available. Monotonically increasing epochs were introduced recently to replace the racy timestamp based mechanism we had before. Each operation generates a new epoch which is saved on each modified entry. So an entry in the keydir may have different versions for different epochs, each pointing to potentially different files.

Merges delete their input files once they complete. The race here is that a fold fetches an epoch, then proceeds to open files in the Bitcask directory. See

bitcask/src/bitcask.erl

Lines 386 to 389 in 4a048be

CurrentEpoch = bitcask_nifs:keydir_get_epoch(State#bc_state.keydir),
PendingEpoch = pending_epoch(State#bc_state.keydir),
FoldEpoch = min(CurrentEpoch, PendingEpoch),
case open_fold_files(State#bc_state.dirname, ?OPEN_FOLD_RETRIES) of
. It will keep retrying until it can successfully open all files listed in the directory. In between grabbing the epoch and opening the files, some of the files needed for the snapshot might have been deleted. Even if new values exist in files created by the merges, they will have higher epochs and not be present in the fold's snapshot. So the fold will miss those entries. This is demonstrated in this branch, where the code has been modified to artificially delay the file opening, allowing a merge to delete some files. The fold misses all values: https://github.com/basho/bitcask/compare/fold-open-delete-race-demo.

This is what I was trying to explain in this comment and this other one.

@engelsanchez any estimate on effort to fix?

I'm in the process of testing a fix for this. I opened the issue a bit late to make sure that my conversations with Scott in #170 and #156 didn't get lost in the shuffle. But I have been working on this and talking to Scott the last couple of days. There is a good chance the PR will be out today.

One question remains open: Now if someone deletes a file from under
Bitcask, the fold will fail indefinitely. Before, it would eventually
proceed and return whatever data is still left. Obviously we'll need to
let it do something similar, falling back to returning partial data.

I'll take a peek at this. Right now, my feeling is that the open Bitcask should be abandoned. When reopened, the 'cask won't know that the missing file is missing, and its startup will proceed without further problems. We'll have to rely on other parts of Riak (or similar) how to repair the damage.

My first attempt at fixing this is here #175

#175 is merged & closed -> closing.