Handle Out-Of-Disk gracefully
Opened this issue · 8 comments
Is your feature request related to a problem? Please describe.
Currently, there are situations in which the Qdrant service can crash if there is not enough disk space to perform an update operation.
This is sub-optimal behavior, as it should still be possible to respond to the search requests in this case.
Describe the solution you'd like
Improve the handling of the situation where Qdrant runs out of disk space. Instead of crashing, it should respond to the user with HTTP 500 and still be able to process incoming search requests.
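To make the requested behavior concrete, here is a minimal sketch of turning an out-of-disk I/O failure into an error value instead of a crash. It is not Qdrant's actual code: `ServiceError`, `classify`, and `apply_update` are hypothetical names, and the `ENOSPC` value of 28 is an assumption that holds on Linux and macOS.

```rust
use std::io;

/// errno for "No space left on device" (assumption: Linux/macOS value).
const ENOSPC: i32 = 28;

/// Hypothetical service-level error, not Qdrant's real error type.
#[derive(Debug, PartialEq)]
enum ServiceError {
    /// The update failed because the disk is full.
    OutOfDisk,
    /// Any other I/O failure, with its message.
    Io(String),
}

impl ServiceError {
    /// Both variants are reported to the client as HTTP 500.
    fn http_status(&self) -> u16 {
        match self {
            ServiceError::OutOfDisk | ServiceError::Io(_) => 500,
        }
    }
}

/// Maps a low-level I/O error to a service error.
fn classify(err: &io::Error) -> ServiceError {
    if err.raw_os_error() == Some(ENOSPC) {
        ServiceError::OutOfDisk
    } else {
        ServiceError::Io(err.to_string())
    }
}

/// Runs a fallible write and returns an error instead of panicking,
/// so the process stays alive and can keep serving reads.
fn apply_update<F>(write: F) -> Result<(), ServiceError>
where
    F: FnOnce() -> io::Result<()>,
{
    write().map_err(|e| classify(&e))
}
```

In this sketch, an update handler would call `apply_update` around the write and use `http_status()` on the error to build the response, while search handlers never touch the failing write path.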
Describe alternatives you've considered
Block requests if the disk usage is above some threshold. This would require configuring an arbitrary threshold and is overall less desirable.
Additional context
We prepared an automated test scenario - #4105
A solution to this issue should include a PR into the test/low-disk-tests branch that makes the OOD test pass.
/bounty $250
$250 bounty • Qdrant
Steps to solve:
- Start working: Comment /attempt #4108 with your implementation plan
- Submit work: Create a pull request including /claim #4108 in the PR body to claim the bounty
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Thank you for contributing to qdrant/qdrant!
Attempt | Started (GMT+0) | Solution |
---|---|---|
🟢 @Rutik7066 | Apr 24, 2024, 3:27:47 PM | WIP |
🟢 @kemkemG0 | May 3, 2024, 9:09:45 AM | #4165 |
I think RocksDB employs an approach similar to the one you proposed, such as this check: if (free_space < reserved_disk_buffer_)...
https://github.com/facebook/rocksdb/blob/ed01babd07ab23788f563e78c234c01d247c09b9/file/sst_file_manager_impl.cc#L272-L291
Additionally, it appears that RocksDB allows users to set the disk buffer size as cf.options.write_buffer_size.
https://github.com/facebook/rocksdb/blob/ed01babd07ab23788f563e78c234c01d247c09b9/db/db_impl/db_impl_open.cc#L2029-L2034
https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#flushing-options
I wonder if we can use wal_capacity_mb for this, but I'm not sure if write_buffer_size is equivalent to WAL size.
Either way, since RocksDB employs a strategy to maintain a maximum disk usage threshold, I think we should adopt a similar approach. I would love to proceed with this strategy. What do you think?
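A rough sketch of what such a reserved-buffer check could look like follows. It is only loosely modeled on the `free_space < reserved_disk_buffer_` idea and is not RocksDB's or Qdrant's actual logic. `WriteDecision` and `check_free_space` are hypothetical names, and the free-space figure is passed in as a parameter because querying it is platform-specific and out of scope here.

```rust
/// Hypothetical outcome of the pre-write disk check.
#[derive(Debug, PartialEq)]
enum WriteDecision {
    Accept,
    RejectLowDisk,
}

/// Decides whether a write of `incoming_bytes` may proceed.
///
/// `free_bytes` is the free space currently reported for the storage volume;
/// `reserved_bytes` is the buffer that must remain free after the write.
fn check_free_space(free_bytes: u64, reserved_bytes: u64, incoming_bytes: u64) -> WriteDecision {
    // checked_sub returns None when the write is larger than the free space,
    // which is rejected as well.
    match free_bytes.checked_sub(incoming_bytes) {
        Some(remaining) if remaining >= reserved_bytes => WriteDecision::Accept,
        _ => WriteDecision::RejectLowDisk,
    }
}
```

A rejected write would be answered with an error while search requests continue to be served. The reserved size is the threshold that the issue description calls arbitrary, so where it comes from (for example a setting such as wal_capacity_mb) remains an open question.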
💡 @kemkemG0 submitted a pull request that claims the bounty. You can visit your bounty board to reward.
@kemkemG0 has been awarded $250!