nofile ulimit reached on systems with large amounts of retention
hosom opened this issue · 7 comments
I have a system with 35 threads writing 2-6 MB index files and ~250 MB packet files.
This worked great until we had more than 20 days of retention, at which point we hit the nofile ulimit, since Stenographer keeps a handle open for every index file. It would be pretty awesome to have some sort of system that enables large amounts of retention by opening and closing old index files as needed.
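For anyone checking whether they are close to the same limit, here is a minimal Go sketch that reads the process's nofile limits. It is only an illustration, not part of Stenographer: the helper name `nofileLimit` is made up for this example, and it assumes a Unix-like system where `syscall.Getrlimit` is available.

```go
package main

import (
	"fmt"
	"syscall"
)

// nofileLimit (hypothetical helper) returns the soft and hard
// RLIMIT_NOFILE values for the current process. Unix-like systems only.
func nofileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := nofileLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("nofile soft=%d hard=%d\n", soft, hard)
}
```

With Go 1.19 and later, programs that import the `os` package raise their soft limit toward the hard limit at startup, so the soft value printed here can differ from what `ulimit -n` shows in the shell.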
There's an experimental filecache branch uploaded now. Please let me know if it works, and if not please send me any logs you can from stenographer so I can debug.
Just an update...
We've been running stable since 5/4 on the filecache branch. I'm not experiencing any user-noticeable slowness at the moment. I have steno writing in 1-minute segments with verbose logging enabled, so if anything goes wrong we'll be able to troubleshoot.
Excellent to hear!
Given that, I'll probably pull the filecache branch into master. Thanks for your work testing this out!
Queries still returning successfully. No errors in the logs when I reviewed them this morning.
Everything still stable and no errors to report.
We haven't restarted services since installing the experimental branch. We can still retrieve packets without errors and have not had any performance issues.
LRU cache has definitely solved this issue.
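For readers wondering what an LRU cache of file handles might look like, here is a minimal Go sketch. It is not the code from the filecache branch; the names `fileCache`, `newFileCache`, `ReadAt`, `Len`, and `writeSampleFiles` are all made up for illustration, and the small temp files merely stand in for index files. The idea is to cap the number of open handles and close the least recently used one when a new file has to be opened.

```go
package main

import (
	"container/list"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// entry pairs an open file with its path so eviction can clean up the map.
type entry struct {
	path string
	file *os.File
}

// fileCache (hypothetical) keeps at most capacity files open at once.
type fileCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	entries  map[string]*list.Element // path -> node in order
}

func newFileCache(capacity int) *fileCache {
	if capacity < 1 {
		capacity = 1 // always allow the file currently being read
	}
	return &fileCache{capacity: capacity, order: list.New(), entries: map[string]*list.Element{}}
}

// ReadAt reads from path at off, opening the file on a cache miss.
// The lock is held during the read so a handle is never closed mid-use;
// that keeps the sketch simple at the cost of serializing reads.
func (c *fileCache) ReadAt(path string, buf []byte, off int64) (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.entries[path]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).file.ReadAt(buf, off)
	}
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	c.entries[path] = c.order.PushFront(&entry{path: path, file: f})
	// Evict least recently used handles until we are back under the cap.
	for c.order.Len() > c.capacity {
		old := c.order.Remove(c.order.Back()).(*entry)
		delete(c.entries, old.path)
		old.file.Close()
	}
	return f.ReadAt(buf, off)
}

// Len reports how many file handles the cache currently holds open.
func (c *fileCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len()
}

// writeSampleFiles creates n small stand-in "index" files in a temp dir.
func writeSampleFiles(n int) (paths []string, cleanup func(), err error) {
	dir, err := os.MkdirTemp("", "filecache-sketch")
	if err != nil {
		return nil, nil, err
	}
	cleanup = func() { os.RemoveAll(dir) }
	for i := 0; i < n; i++ {
		name := fmt.Sprintf("index-%d", i)
		p := filepath.Join(dir, name)
		if err := os.WriteFile(p, []byte(name), 0o600); err != nil {
			cleanup()
			return nil, nil, err
		}
		paths = append(paths, p)
	}
	return paths, cleanup, nil
}

func main() {
	paths, cleanup, err := writeSampleFiles(5)
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()
	cache := newFileCache(2)
	buf := make([]byte, len("index-0"))
	for _, p := range paths {
		n, err := cache.ReadAt(p, buf, 0)
		fmt.Printf("%q err=%v open=%d\n", buf[:n], err, cache.Len())
	}
}
```

In this sketch the number of open handles never exceeds the configured capacity, no matter how many files exist on disk. The trade-off is that a query touching an evicted file pays for a fresh `os.Open`, which fits the reports above of no user-noticeable slowness but would need measuring on a real workload.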