martinsumner/leveled

Empty Journal - File in Ledger

Closed this issue · 2 comments

An interesting situation can occur if:

  • the file system is wiped while the store is running
  • the store is then closed

In this situation, as part of closing, the Penciller will save the information it holds in memory to disk as its "last breath". This is done to speed up start-up (so that the information doesn't need to be reloaded from the Journal).

This means that the Ledger will have a single SST file - containing a range of recent SQNs (say 9000 - 10000). There will be no other files on disk.

When the store is next opened, the Penciller starts and discovers the file of recent heads. It then tries to load from the Journal from this point forward - but the Journal is empty, so this file is the only data in the store. However, the Inker still thinks it is at SQN 0 - as it started empty.

What happens next is that the store will answer HEAD requests for some objects for which it cannot provide GET responses. In a Riak context, this anomaly is detected (this depends on check_presence being used), and the data will be repaired through read repair.

However, repaired data, and any new data, will be assigned a SQN < 10000. When this information is loaded into the Penciller, the Penciller crashes, as the SQN is lower than the maximum SQN it already holds. On restart, the net effect is to "lose" this data. The Penciller will load information from the Journal from SQN 10000, and everything below 10000 will not be retrievable.
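The sequence above can be illustrated with a toy model. This is only a sketch in Python, not leveled's Erlang code: `ToyPenciller`, `ToyInker` and `replay_scenario` are hypothetical names, and the rule that a pushed SQN must exceed the Ledger's current maximum is an assumption taken from the description of the crash.

```python
class ToyPenciller:
    """Hypothetical stand-in for the Ledger side (not leveled's API)."""

    def __init__(self, persisted_max_sqn):
        # Highest SQN found in the SST file discovered at startup.
        self.max_sqn = persisted_max_sqn

    def push(self, sqn):
        # Assumed invariant: each pushed SQN must exceed the current maximum.
        if sqn <= self.max_sqn:
            raise ValueError(f"SQN {sqn} is not above ledger max {self.max_sqn}")
        self.max_sqn = sqn


class ToyInker:
    """Hypothetical stand-in for the Journal side; numbers each new write."""

    def __init__(self, journal_max_sqn):
        self.sqn = journal_max_sqn

    def put(self):
        self.sqn += 1
        return self.sqn


def replay_scenario():
    # The last-breath file survived with SQNs up to 10000; the Journal is empty.
    penciller = ToyPenciller(persisted_max_sqn=10000)
    inker = ToyInker(journal_max_sqn=0)
    sqn = inker.put()  # a repaired or new object is given SQN 1
    try:
        penciller.push(sqn)
        return "accepted"
    except ValueError:
        return "crashed"
```

In this model the first write after restart is rejected, which corresponds to the Penciller crash described above.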

Within Riak - eventually everything gets repaired. But the interim is an ugly series of repairs, crashes, re-repairs, more crashes etc.

How to resolve this ugliness:

  • Stop the last breath save. This stops the situation where this is the root cause, but there may be other situations where information in the Journal gets corrupted/lost.

  • Don't load files in the Ledger where the max SQN is beyond the Inker's view of the SQN at startup. This stops the phantom availability of data that may delay repair.

  • Make the Inker SQN at startup the max of its own view, and the Ledger view.

Current view is that the third solution is the best one. It avoids the risk of writing data that will be lost after recovery, but still tries to keep as much information as possible on recovery - even if that information is a head with no body (because check_presence can resolve this).
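A minimal sketch of the third option, again as a Python toy rather than leveled's actual code. The function names `startup_sqn` and `next_sqns` are hypothetical, and the example values reuse the 10000 figure from the scenario above.

```python
def startup_sqn(journal_max_sqn, ledger_max_sqn):
    """Pick the Inker's starting SQN as the max of the two views."""
    return max(journal_max_sqn, ledger_max_sqn)


def next_sqns(start, count):
    """SQNs the Inker would hand out for the next `count` writes."""
    return [start + i for i in range(1, count + 1)]


# Empty Journal (SQN 0), Ledger file holding SQNs up to 10000.
start = startup_sqn(journal_max_sqn=0, ledger_max_sqn=10000)
new_writes = next_sqns(start, 3)
```

With the starting point taken from the Ledger, repaired and new objects are numbered above 10000, so they no longer fall below the Penciller's maximum. When the Journal is ahead of the Ledger (the normal case), the Inker's own view wins and behaviour is unchanged.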