oxidecomputer/propolis

IO ordering with Propolis file backend

Closed this issue · 3 comments

The Propolis FileBackend spins up 32 worker threads, which await data using block::backend::Attachment::block_for_req.

If the guest sends a write immediately followed by a flush, I'm not sure that our current implementation guarantees correct ordering of operations. Consider the following sequence:

  • Guest sends a write
  • Guest sends a flush
  • Worker thread 0 gets Operation::Write(..) from the queue
  • Worker thread 1 gets Operation::Flush from the queue
  • At this point, there's no coordination between worker threads, so it's possible for scheduling to lead to...
    • Worker thread 1 calls self.fp.sync_data(..), performing the flush
    • Worker thread 0 then performs the write after the flush has already completed, so the written data is not durably on disk!
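
To make the sequence above concrete, here is a minimal, deterministic sketch of the two possible execution orders. It is a toy model, not Propolis code: ToyDisk, Op, and durable_after are hypothetical names, and the "flush" just moves bytes from a pretend volatile cache to a pretend durable store.

```rust
/// Toy model of a disk with a volatile write cache.
/// Hypothetical type for illustration only; not part of Propolis.
#[derive(Default)]
struct ToyDisk {
    cached: Vec<u8>,  // written, but would be lost on power failure
    durable: Vec<u8>, // flushed, survives power failure
}

/// The two operations from the sequence above (payload reduced to one byte).
#[derive(Clone, Copy)]
enum Op {
    Write(u8),
    Flush,
}

impl ToyDisk {
    fn apply(&mut self, op: Op) {
        match op {
            // A write only lands in the volatile cache.
            Op::Write(b) => self.cached.push(b),
            // A flush makes everything written *so far* durable.
            Op::Flush => self.durable.append(&mut self.cached),
        }
    }
}

/// Applies `ops` in the order the workers happened to execute them and
/// returns what would survive a power loss afterwards.
fn durable_after(ops: &[Op]) -> Vec<u8> {
    let mut disk = ToyDisk::default();
    for &op in ops {
        disk.apply(op);
    }
    disk.durable
}
```

Calling durable_after(&[Op::Write(7), Op::Flush]) models the order the guest submitted, while durable_after(&[Op::Flush, Op::Write(7)]) models worker thread 1 winning the race: in the second case the byte is still sitting in the cache when the flush runs.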

That's expected behaviour from an NVMe perspective at least. A flush command only applies to commands completed prior to the submission of the flush.

Yeah, it's on the guest to look for IO completions for its writes before issuing a flush to make them consistent.
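
For illustration, the guest-side pattern described here could look roughly like the sketch below: submit the write, wait for its completion, and only then submit the flush. Everything in it is an assumption made for the example (Req, Disk, worker, write_then_flush, the pool of 4 threads, the channel-based completions); it does not use or mirror the real Propolis block backend API.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical request type: an operation plus a channel to signal completion.
enum Req {
    Write(u8, Sender<()>),
    Flush(Sender<()>),
}

// Toy disk: `cached` would be lost on power failure, `durable` would not.
#[derive(Default)]
struct Disk {
    cached: Vec<u8>,
    durable: Vec<u8>,
}

// Each worker pulls requests from a shared queue, with no ordering between workers.
fn worker(queue: Arc<Mutex<Receiver<Req>>>, disk: Arc<Mutex<Disk>>) {
    loop {
        // The queue lock is held only while dequeuing.
        let req = match queue.lock().unwrap().recv() {
            Ok(req) => req,
            Err(_) => return, // sender dropped: shut down
        };
        match req {
            Req::Write(b, done) => {
                disk.lock().unwrap().cached.push(b);
                let _ = done.send(()); // report completion to the guest
            }
            Req::Flush(done) => {
                let mut d = disk.lock().unwrap();
                let pending = std::mem::take(&mut d.cached);
                d.durable.extend(pending);
                drop(d);
                let _ = done.send(());
            }
        }
    }
}

// "Guest" side: wait for the write's completion before submitting the flush.
fn write_then_flush(byte: u8) -> Vec<u8> {
    let (tx, rx) = channel::<Req>();
    let queue = Arc::new(Mutex::new(rx));
    let disk = Arc::new(Mutex::new(Disk::default()));
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let (q, d) = (Arc::clone(&queue), Arc::clone(&disk));
            thread::spawn(move || worker(q, d))
        })
        .collect();

    let (done_tx, done_rx) = channel();
    tx.send(Req::Write(byte, done_tx)).unwrap();
    done_rx.recv().unwrap(); // the write has completed

    let (done_tx, done_rx) = channel();
    tx.send(Req::Flush(done_tx)).unwrap(); // flush is submitted only now
    done_rx.recv().unwrap();

    drop(tx); // closing the queue lets the workers exit
    for w in workers {
        w.join().unwrap();
    }
    let durable = disk.lock().unwrap().durable.clone();
    durable
}
```

Because the flush is not even in the queue until the write's completion has been observed, it does not matter which worker picks up which request: write_then_flush(7) always ends with the byte in the durable store.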

Got it, thanks!