Feature: realtime Secondary TryCatchUpWithPrimary
rockeet opened this issue · 0 comments
We have written a FUSE file system that intentionally blocks reads at EOF on files that are still being written. Thus, while the primary instance is writing WAL and manifest files, an EOF read on the secondary instance blocks; as soon as the primary writes to (or closes) the WAL or manifest, the blocked read on the secondary returns immediately with the newly written data. Our benchmark shows a latency of ~100us on commodity Ethernet (with NFS + O_DIRECT).
With this feature, the secondary instance need not sleep between `TryCatchUpWithPrimary` loops, but there are issues with the current implementation:
- long `lock(mutex_)` in `TryCatchUpWithPrimary`
- catching up the WAL depends on catching up the manifest
If the long lock on `mutex_` can be replaced with a short lock, and the dependency removed so that the WAL and manifest can catch up in 2 different threads, it should be a perfect solution.
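To make the idea concrete, here is a minimal, self-contained sketch. It is not RocksDB code: `BlockingLog`, `SecondaryState`, `Tail` and `RunDemo` are hypothetical names, the blocking EOF read of the FUSE mount is simulated with a condition variable, and the sketch assumes the WAL/manifest dependency has already been removed. Each thread blocks outside the lock and takes `mutex_` only to apply one record.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a file on the FUSE mount: Read() blocks at EOF
// until the writer appends more data or closes the file.
class BlockingLog {
 public:
  void Append(const std::string& rec) {
    {
      std::lock_guard<std::mutex> g(mu_);
      records_.push_back(rec);
    }
    cv_.notify_all();
  }
  void Close() {
    {
      std::lock_guard<std::mutex> g(mu_);
      closed_ = true;
    }
    cv_.notify_all();
  }
  // Returns false only when the file is closed and fully consumed.
  bool Read(std::size_t pos, std::string* out) {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [&] { return pos < records_.size() || closed_; });
    if (pos >= records_.size()) return false;
    *out = records_[pos];
    return true;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::vector<std::string> records_;
  bool closed_ = false;
};

// Hypothetical secondary state guarded by a single mutex, like mutex_ above.
struct SecondaryState {
  std::mutex mutex_;
  std::vector<std::string> applied;
};

// Tail one log: wait for data without holding mutex_, then lock briefly
// to apply a single record.
void Tail(BlockingLog* log, SecondaryState* state) {
  std::string rec;
  for (std::size_t pos = 0; log->Read(pos, &rec); ++pos) {
    std::lock_guard<std::mutex> g(state->mutex_);  // short critical section
    state->applied.push_back(rec);
  }
}

// Simulates a primary writing both files while two threads tail them.
// Returns the number of records the secondary applied.
std::size_t RunDemo() {
  BlockingLog wal, manifest;
  SecondaryState state;
  std::thread wal_thread(Tail, &wal, &state);
  std::thread manifest_thread(Tail, &manifest, &state);
  wal.Append("put k1");
  manifest.Append("add file 7");
  wal.Append("put k2");
  wal.Close();
  manifest.Close();
  wal_thread.join();
  manifest_thread.join();
  return state.applied.size();
}
```

No thread sleeps or polls here; the only waiting happens inside the blocking read, which is the behavior the FUSE file system provides.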
I'm not familiar with the internal complexity of the secondary instance and cannot contribute a PR, but I would like a solution for this feature.
Another question: `PurgeObsoleteFiles` may not be needed in `TryCatchUpWithPrimary`, because the files are shared with the primary; they should be purged by the primary (with a retention period).
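As a rough illustration of what a retention period on the primary side could mean, the sketch below selects only files that have been obsolete for at least the retention period. `ObsoleteFile` and `FilesPastRetention` are hypothetical names for illustration, not existing RocksDB APIs.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical record of a file the primary no longer needs.
struct ObsoleteFile {
  std::string name;
  Clock::time_point obsolete_since;
};

// Returns the names of files that have been obsolete for at least
// `retention`, so a secondary still reading them has time to move on.
std::vector<std::string> FilesPastRetention(
    const std::vector<ObsoleteFile>& files, Clock::time_point now,
    std::chrono::seconds retention) {
  std::vector<std::string> out;
  for (const auto& f : files) {
    if (now - f.obsolete_since >= retention) out.push_back(f.name);
  }
  return out;
}
```

With such a rule the secondary never deletes shared files itself; it only needs to catch up within the retention window.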