How much memory does this use?
I'm wondering about memory use. The only mention is in the README, which says "without needing temporary files or excessive memory". I'm assuming that this is all streaming and that it only holds in-flight chunks in memory, but I haven't been able to try it out and verify this yet.
So, what is actually being held in memory? Is it just in-flight chunks, or the whole current file, or what?
Thanks!
You're correct: it's all streaming, and for the most part only in-flight chunks are held in memory. For example, if you want to zip up 100GB stored across 100 files, you won't use anywhere close to 100GB of memory. You'd essentially be storing 100 sets of metadata (path, size, name, etc.), one per file, and only pulling the parts of the files that are actively being compressed/streamed into memory while the zip is being generated.
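To illustrate the general idea (this is not this library's code, just a minimal sketch using the standard library), here is a generator that compresses files one fixed-size chunk at a time. The names `stream_compressed` and `CHUNK_SIZE` are made up for the example, and the output is a raw zlib stream per file rather than a valid zip archive. The point is only that at any moment a single input chunk and its compressed output are in memory, regardless of how large the inputs are.

```python
import io
import zlib
from typing import BinaryIO, Iterable, Iterator

CHUNK_SIZE = 16 * 1024  # hypothetical chunk size, chosen for illustration


def stream_compressed(
    sources: Iterable[BinaryIO], chunk_size: int = CHUNK_SIZE
) -> Iterator[bytes]:
    """Yield compressed bytes for each source, reading one chunk at a time.

    Memory use is bounded by roughly one input chunk plus the compressor's
    internal state, not by the total size of the sources.
    """
    for src in sources:
        compressor = zlib.compressobj()  # fresh compressor per file
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            out = compressor.compress(chunk)
            if out:  # the compressor may buffer and return nothing yet
                yield out
        # Emit whatever the compressor is still holding for this file.
        yield compressor.flush()


if __name__ == "__main__":
    payload = io.BytesIO(b"hello world " * 100_000)
    total_out = sum(len(part) for part in stream_compressed([payload]))
    print(f"compressed output: {total_out} bytes")
```

Because the consumer pulls from the generator, nothing is read from a file until the previous chunk has been handed off, which is what keeps the footprint flat.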
As an example, I just ran a quick test against one of my applications that uses this library by asking it to zip up ~8TB spread across ~10,000 files, a mix of small and multi-GB ones. The peak memory usage of the process (including the rest of my application) was ~30MB.
EDIT: For a more accurate test, I ran a minimal script that streamed ~20,000 files into what ended up being a 16.8TB zip file while monitoring memory usage with memray. Memray reported a peak memory usage of 17.494MB, with 15.259TB of memory allocated in total.
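If you want to check the "peak stays small while total allocations are huge" behaviour yourself without installing anything, a rough sketch with the standard library's `tracemalloc` looks like the following. This is not the script I used, and it does not touch this library at all: `fake_stream` and `measure_peak` are hypothetical helpers that stand in for any chunk-producing generator. `tracemalloc` only sees allocations made through Python's allocator, so treat the number as an approximation of what memray would report.

```python
import tracemalloc
from typing import Iterator, Tuple


def fake_stream(total_bytes: int, chunk_size: int) -> Iterator[bytes]:
    """Stand-in for a real stream: yields zero-filled chunks of data."""
    remaining = total_bytes
    while remaining > 0:
        size = min(chunk_size, remaining)
        yield bytes(size)
        remaining -= size


def measure_peak(total_bytes: int, chunk_size: int) -> Tuple[int, int]:
    """Consume the stream and return (bytes streamed, peak traced bytes)."""
    tracemalloc.start()
    streamed = 0
    for chunk in fake_stream(total_bytes, chunk_size):
        streamed += len(chunk)  # chunk is dropped on the next iteration
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return streamed, peak


if __name__ == "__main__":
    streamed, peak = measure_peak(256 * 1024 * 1024, 64 * 1024)
    print(f"streamed {streamed} bytes, peak traced memory {peak} bytes")
```

The peak should land in the neighbourhood of a couple of chunks, while the total streamed is whatever you ask for.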