open() and close() are called way too many times
muusbolla opened this issue · 1 comment
I have a 1GB dataset that contains a lot of identical (zero) bytes. It compresses down to ~640KB. The dimensions are:
Element size: 1 byte
Entire data: 1024x1024x1024
Chunk: 64x64x64
Block: 32x16x64
I have a simple test that just calls caterva_open() followed by caterva_to_buffer() to decompress the entire 1GB buffer. After hacking caterva_open() to take a blosc2_io* instead of forcing BLOSC2_IO_DEFAULTS, and adding a counter to each of the I/O functions to see how often they were called, I saw the following output:
```
decompressed 1073741824B in 1524ms
open() called 40967 times
close() called 40967 times
tell() called 0 times
seek() called 40963 times
write() called times
read() called 45064 times
trunc() called 0 times
```
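For reference, this is a minimal sketch of the kind of counting wrappers that can produce numbers like these: each wrapper bumps a counter and delegates to stdio, and would be plugged in through the blosc2_io callback table. The fopen-like signatures below are assumed for illustration (the real blosc2_io_cb prototypes may differ between c-blosc2 versions), so treat this as a sketch rather than the actual patch.

```c
/* Sketch of counting I/O wrappers (assumed fopen-like callback signatures;
 * the real blosc2_io_cb prototypes may differ). Each wrapper increments a
 * counter and delegates to stdio. */
#include <stdio.h>
#include <stdint.h>

static long n_open, n_close, n_seek, n_read;

static void *counting_open(const char *urlpath, const char *mode) {
  n_open++;
  return fopen(urlpath, mode);
}

static int counting_close(void *stream) {
  n_close++;
  return fclose((FILE *)stream);
}

static int counting_seek(void *stream, int64_t offset, int whence) {
  n_seek++;
  return fseek((FILE *)stream, (long)offset, whence);
}

static int64_t counting_read(void *ptr, int64_t size, int64_t nitems, void *stream) {
  n_read++;
  return (int64_t)fread(ptr, (size_t)size, (size_t)nitems, (FILE *)stream);
}

/* Dump the totals after the decompression call. */
static void print_io_stats(void) {
  printf("open() called %ld times\n", n_open);
  printf("close() called %ld times\n", n_close);
  printf("seek() called %ld times\n", n_seek);
  printf("read() called %ld times\n", n_read);
}
```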
Per my math, it looks like we call open()+close() twice per chunk (8192 calls) and once per block (32768 calls), even though each block here is tiny (less than 20B compressed). This is still relatively fast on my system because Linux serves the reads from its in-memory page cache, but if the I/O functions were backed by anything with significant latency (e.g. remote file access), this could slow things to a crawl. We also call malloc() in every open and free() in every close, so there are ~40k of those as well.
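For completeness, here is the shape arithmetic behind those numbers (the handful of calls beyond 40960 presumably come from reading the frame header/metadata):

```c
/* Worked arithmetic for the call counts above, from the stated shapes. */
#include <stdio.h>

int main(void) {
  long chunks = (1024 / 64) * (1024 / 64) * (1024 / 64);      /* 4096 chunks  */
  long blocks_per_chunk = (64 / 32) * (64 / 16) * (64 / 64);  /* 8 blocks     */
  long blocks = chunks * blocks_per_chunk;                    /* 32768 blocks */
  long opens = 2 * chunks + 1 * blocks;                       /* 40960 opens  */
  printf("chunks=%ld blocks=%ld opens~=%ld\n", chunks, blocks, opens);
  return 0;
}
```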
Consider opening the file once and leaving it open for the duration of the decompression call. If we need to access sparse regions of the file, stream/buffer them in sequential disk byte order.
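One way to prototype that suggestion without touching the library internals would be an I/O layer whose open() hands back a cached handle for the same path and whose close() is deferred until an explicit flush at the end of the operation. A rough sketch follows; the fopen-like signatures and the single-file assumption are mine, not c-blosc2's, and a real blosc2_io_cb plug-in would need to match the library's prototypes.

```c
/* Sketch of a "persistent handle" I/O layer: open() reuses one cached FILE*
 * per path and close() is deferred until the caller is done. Assumes a single
 * file opened with a consistent mode; signatures are assumed fopen-like. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static FILE *cached_fp = NULL;
static char *cached_path = NULL;

static void *persistent_open(const char *urlpath, const char *mode) {
  if (cached_fp != NULL && cached_path != NULL && strcmp(cached_path, urlpath) == 0) {
    return cached_fp;               /* reuse the already-open handle */
  }
  if (cached_fp != NULL) {          /* different file: drop the old handle */
    fclose(cached_fp);
    free(cached_path);
  }
  cached_fp = fopen(urlpath, mode);
  cached_path = (cached_fp != NULL) ? strdup(urlpath) : NULL;
  return cached_fp;
}

static int persistent_close(void *stream) {
  (void)stream;                     /* deferred: keep the handle alive */
  return 0;
}

/* Call this once the whole caterva_to_buffer() call has finished. */
static void persistent_flush(void) {
  if (cached_fp != NULL) {
    fclose(cached_fp);
    cached_fp = NULL;
  }
  free(cached_path);
  cached_path = NULL;
}
```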
Yes. I made an attempt to reduce that in the C-Blosc2 code (which is where file open/close is exercised). My attempt is in the optim_open_file branch, but since I did not get any noticeable speedup, I decided not to merge it. Feel free to experiment with this and tell us if you can get better performance than I did.