DataDog/glommio

RFC: A Send safe `DmaFile` to let the fd be shared across threads

vlovich opened this issue · 0 comments

Because all of DmaFile's APIs operate on explicit offsets within the file, the fd it owns is effectively Send safe: there's no kernel state (such as a shared file cursor) that another thread could mutate unsafely, with the exception of closure (discussed below). I want a lighter-weight mechanism that lets other threads reference the fd without needing to dup it.

I have a use case where, in response to a read request, I sometimes return an fd plus a reference to a file region, which can then be lazily read on an arbitrary thread (or have sendfile invoked on it). Right now I do this by invoking dup on the DmaFile to grab a reference. The problem is that, aside from needing a synchronous syscall, I risk exhausting the number of open files: at 2M reads/s, if even 10% hit this path, that's 200k duplicate fds in flight, which is a lot of kernel memory for no real value.

I'd like some kind of lighter-weight mechanism where I can share access to the fd without involving the kernel. There are two options that come to mind.

Option 1: Change the internals to store an Arc<RawFd> and introduce a BorrowedDmaFile counterpart to OwnedDmaFile. Either BorrowedDmaFile could be converted into a DmaFile that doesn't auto-close and disallows invoking close, or we introduce a new DmaFileRef type that BorrowedDmaFile can be converted into. The former is a smaller refactor with less code duplication; the latter feels potentially cleaner.
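To make the shape of option 1 concrete, here's a rough std-only sketch. Everything in it is hypothetical: std::fs::File stands in for Glommio's internals, I hold an Arc<File> rather than an Arc<RawFd> so the example can actually run, and the method names (borrow_file, read_region_on_other_thread) are made up for illustration. The point is only that handing out a reference costs an atomic increment instead of a dup syscall, and that positional reads work from any thread.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsRawFd, RawFd};
use std::os::unix::fs::FileExt;
use std::sync::Arc;
use std::thread;

/// Hypothetical owning handle. `File` stands in for Glommio's internals.
pub struct OwnedDmaFile {
    inner: Arc<File>,
}

/// Hypothetical non-owning handle. It is `Send` because `Arc<File>` is.
#[derive(Clone)]
pub struct BorrowedDmaFile {
    inner: Arc<File>,
}

impl OwnedDmaFile {
    pub fn new(file: File) -> Self {
        Self { inner: Arc::new(file) }
    }

    /// Hands out a reference with one atomic increment and no syscall.
    pub fn borrow_file(&self) -> BorrowedDmaFile {
        BorrowedDmaFile { inner: Arc::clone(&self.inner) }
    }
}

impl BorrowedDmaFile {
    /// Raw fd for APIs such as sendfile; valid while this handle is alive.
    pub fn as_raw_fd(&self) -> RawFd {
        self.inner.as_raw_fd()
    }

    /// Positional read: the offset is passed explicitly, so the shared
    /// file cursor is never touched.
    pub fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        self.inner.read_at(buf, offset)
    }
}

/// Reads up to `len` bytes at `offset` on a freshly spawned thread.
pub fn read_region_on_other_thread(
    file: &OwnedDmaFile,
    offset: u64,
    len: usize,
) -> io::Result<Vec<u8>> {
    let borrowed = file.borrow_file();
    thread::spawn(move || -> io::Result<Vec<u8>> {
        let mut buf = vec![0u8; len];
        let n = borrowed.read_at(&mut buf, offset)?;
        buf.truncate(n);
        Ok(buf)
    })
    .join()
    .expect("reader thread panicked")
}
```

In this sketch the fd is closed by whichever handle is dropped last, which is not how Glommio closes files today; that's the closure question discussed further down.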

Option 2: Implement FromRawFd for DmaFile, as GlommioFile does. I'm not sure this is actually workable, since FromRawFd doesn't let us control whether or not the DmaFile auto-closes on drop.
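For reference, the usual std-level workaround for "FromRawFd always takes ownership" is to wrap the result in ManuallyDrop so the fd is never closed by the view. A minimal sketch, with std::fs::File standing in for DmaFile and NonOwningFile / read_via_view being hypothetical names, looks something like this. It also shows why I'm not keen on this route: the constructor has to be unsafe, because nothing ties the view's lifetime to the real owner.

```rust
use std::fs::File;
use std::io;
use std::mem::ManuallyDrop;
use std::os::fd::{AsRawFd, FromRawFd, RawFd};
use std::os::unix::fs::FileExt;

/// Hypothetical non-owning view over an fd that someone else will close.
pub struct NonOwningFile {
    // `ManuallyDrop` suppresses `File`'s destructor, so dropping the view
    // never closes the fd.
    file: ManuallyDrop<File>,
}

impl NonOwningFile {
    /// # Safety
    /// `fd` must remain open for as long as the returned view is used.
    pub unsafe fn from_borrowed_raw_fd(fd: RawFd) -> Self {
        Self {
            file: ManuallyDrop::new(unsafe { File::from_raw_fd(fd) }),
        }
    }

    /// Positional read through the borrowed fd.
    pub fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        self.file.read_at(buf, offset)
    }
}

/// Reads through a temporary view; `owner` keeps the fd open throughout.
pub fn read_via_view(owner: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    // SAFETY: `owner` is borrowed for the whole call, so the fd outlives `view`.
    let view = unsafe { NonOwningFile::from_borrowed_raw_fd(owner.as_raw_fd()) };
    view.read_at(buf, offset)
}
```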

I feel like option 1 is cleaner since it ensures the file was consistently opened by Glommio rather than taking over ownership of any old fd. It also provides a better opportunity to inject some safety into this mechanism. The main question is safety with respect to file closure. One possible technique would be to check strong_count when the owning DmaFile is dropped or has close invoked on it: if it's 1, close normally; otherwise, register the fd for deferred closure once its count drops to 0. The open problems there are how the owning thread notices that the count has reached 0, and the risk of leaking the fd if the owning thread finishes while other threads still hold non-owning references. There may be ways to deal with this, or we can just declare the borrow API unsafe and document that you have to be supremely careful about file closures.
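One variation on the strong_count idea, sketched below with std only, is to let the last holder do the close wherever it happens to be, instead of routing the close back to the owning thread. This is not the deferred-closure scheme described above and it is not how Glommio closes files (the close here is a plain synchronous one performed by OwnedFd's destructor); all type and method names are hypothetical. I use Arc::try_unwrap rather than reading strong_count, since a separate check-then-act on the count would be racy.

```rust
use std::fs::File;
use std::os::fd::{AsRawFd, OwnedFd, RawFd};
use std::sync::{Arc, Weak};

/// What happened when the owner asked to close.
#[derive(Debug, PartialEq, Eq)]
pub enum CloseOutcome {
    /// No borrowers were left, so the fd was closed immediately.
    ClosedNow,
    /// Borrowers remain; the fd closes when the last one is dropped.
    Deferred,
}

/// Hypothetical owning handle.
pub struct OwnedDmaFile {
    fd: Arc<OwnedFd>,
}

/// Hypothetical non-owning handle that keeps the fd alive.
pub struct BorrowedDmaFile {
    fd: Arc<OwnedFd>,
}

impl OwnedDmaFile {
    pub fn new(file: File) -> Self {
        Self { fd: Arc::new(OwnedFd::from(file)) }
    }

    pub fn borrow_fd(&self) -> BorrowedDmaFile {
        BorrowedDmaFile { fd: Arc::clone(&self.fd) }
    }

    /// Lets callers observe whether the fd is still alive, without
    /// keeping it alive themselves.
    pub fn watch(&self) -> Weak<OwnedFd> {
        Arc::downgrade(&self.fd)
    }

    pub fn close(self) -> CloseOutcome {
        match Arc::try_unwrap(self.fd) {
            // We held the only strong reference: dropping `OwnedFd` closes the fd.
            Ok(fd) => {
                drop(fd);
                CloseOutcome::ClosedNow
            }
            // Others still hold it: release our reference and let the last
            // holder's drop perform the close.
            Err(shared) => {
                drop(shared);
                CloseOutcome::Deferred
            }
        }
    }
}

impl BorrowedDmaFile {
    pub fn as_raw_fd(&self) -> RawFd {
        self.fd.as_raw_fd()
    }
}
```

This avoids both the "how does the owner notice" and the "owner exits first" problems, at the cost of a blocking close on some arbitrary thread, which may or may not be acceptable for Glommio.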

I would also prefer that the borrow mechanism use a biased Rc (e.g. hybrid-rc) to defer paying the atomic increment cost until we know we're crossing a thread boundary, but I don't know the project's policy on taking on third-party dependencies.