Handle leak in parallel (nthreads > 1) zstd decompression
jocbeh opened this issue · 5 comments
Describe the bug
A resource leak has been observed in blosc2_decompress_ctx for zstd with nthreads > 1.
It seems that thread resources are not properly released when zstd is decompressed in parallel.
- If nthreads == 1, everything is fine.
- If nthreads > 1, each decompression increases the number of open handles in the system.
Although the leak from a single decompression is small and negligible, it inevitably leads to unacceptable behavior in long-running applications that call blosc2_decompress_ctx very frequently.
The bug was initially found in blosc2 version 2.14.4 and is still reproducible in version 2.15.1.
To Reproduce
Use blosc2_decompress_ctx with nthreads > 1 and zstd. Decompress a large number of objects and observe the handle/thread consumption.
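A minimal sketch of such a reproduction loop, assuming libblosc2 is installed (compile with e.g. `gcc repro.c -lblosc2`). The buffer sizes and iteration count are illustrative, not taken from the original report:

```c
#include <blosc2.h>
#include <stdio.h>

#define NELEMS (1000 * 1000)

int main(void) {
  static float src[NELEMS], dec[NELEMS];
  static char comp[sizeof(src) + BLOSC2_MAX_OVERHEAD];

  blosc2_init();

  /* Compress once with zstd. */
  blosc2_cparams cparams = BLOSC2_CPARAMS_DEFAULTS;
  cparams.compcode = BLOSC_ZSTD;
  cparams.typesize = sizeof(float);
  blosc2_context *cctx = blosc2_create_cctx(cparams);
  int csize = blosc2_compress_ctx(cctx, src, sizeof(src), comp, sizeof(comp));
  blosc2_free_ctx(cctx);

  /* Decompress in a loop, creating (and freeing) a fresh dctx per call,
   * as the application under discussion does.  With nthreads > 1 the
   * handle count of the process was observed to grow on Windows. */
  blosc2_dparams dparams = BLOSC2_DPARAMS_DEFAULTS;
  dparams.nthreads = 2;
  for (int i = 0; i < 1000; i++) {
    blosc2_context *dctx = blosc2_create_dctx(dparams);
    int dsize = blosc2_decompress_ctx(dctx, comp, csize, dec, sizeof(dec));
    blosc2_free_ctx(dctx);  /* the ctx is freed, yet handles accumulate */
    if (dsize < 0) {
      fprintf(stderr, "decompression error: %d\n", dsize);
      return 1;
    }
  }
  printf("done\n");
  blosc2_destroy();
  return 0;
}
```

Watch the process in Process Explorer (Windows) or `ls /proc/<pid>/task` (Linux) while this runs to see whether the thread/handle count grows.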
Expected behavior
Threads are released.
Logs
System information:
- OS: Windows 10
- Compiler: Microsoft (R) C/C++ Optimizing Compiler Version 19.38.33130
- Blosc version: 2.15.1
Additional context
Could you send a minimal, self-contained C program showing the leak? Also, if you can beat us to it and contribute a patch, that would be great. Thanks!
Hi @FrancescAlted,
Thank you very much for your fast reply!
Please find below a simple VS 2022 C++ application that demonstrates the resource leak. I hope this helps.
You can simply run the exe (BloscDecompressor.exe). It decompresses a zstd-compressed file in an endless loop with nthreads=2.
If you need a version for a Visual Studio release prior to 2022, please contact me.
In the meantime, we have identified that the handles reference thread objects which are not properly released in the application.
Contrary to a misleading statement in the original bug report, the problem is independent of the compression format.
It can also be observed with LZ4HC. It seems to be related to the context (blosc2_create_dctx) that is created for each blosc2_decompress_ctx call. I've adapted the bug report accordingly.
We also consider it possible that we are using the API incorrectly on our side. We are mainly focused on C# development and have no experience in this kind of "compression" domain.
This is also the reason we will not be able to contribute a patch in an acceptable time frame.
Best Regards,
Jochen
I have used your program (see contexts3.c.zip), but valgrind is not reporting any leak:
$ valgrind examples/contexts3
==3386412== Memcheck, a memory error detector
==3386412== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3386412== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==3386412== Command: examples/contexts3
==3386412==
decompression took 614089.000000 ms to execute 11692800 bytes
decompression took 472141.000000 ms to execute 11692800 bytes
decompression took 427900.000000 ms to execute 11692800 bytes
decompression took 418579.000000 ms to execute 11692800 bytes
decompression took 436319.000000 ms to execute 11692800 bytes
decompression took 442640.000000 ms to execute 11692800 bytes
decompression took 432874.000000 ms to execute 11692800 bytes
decompression took 419834.000000 ms to execute 11692800 bytes
decompression took 427667.000000 ms to execute 11692800 bytes
decompression took 423952.000000 ms to execute 11692800 bytes
==3386412==
==3386412== HEAP SUMMARY:
==3386412== in use at exit: 0 bytes in 0 blocks
==3386412== total heap usage: 87 allocs, 87 frees, 101,355,106 bytes allocated
==3386412==
==3386412== All heap blocks were freed -- no leaks are possible
==3386412==
==3386412== For lists of detected and suppressed errors, rerun with: -s
==3386412== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So, after 10 iterations there is no leak.
There must be something else going on in your application.
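One caveat worth noting: valgrind's memcheck tool only tracks heap memory, so thread or handle counts as seen in Process Explorer are outside its scope. The closest it gets is file-descriptor tracking (a sketch; `examples/contexts3` is the binary from the log above):

```shell
# Memcheck reports heap leaks only; Helgrind/DRD focus on data races.
# --track-fds covers file descriptors, the nearest Unix analogue of
# Windows handles -- it will not show leaked thread objects either.
valgrind --leak-check=full --track-fds=yes examples/contexts3
```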
I'm going to close this. In case you find another example showing the leak, please feel free to reopen.
Hi,
I'm not really familiar with valgrind. Does it also check whether threads are correctly released/stopped after decompression?
We observe an increasing number of threads (nthreads × number of decompression calls).
Using your example and simply setting a breakpoint after the decompression loop, we see 20 threads in Process Explorer. Please refer to the attached screenshot.
Maybe this is a Windows-specific issue?
I would like to reopen this issue, but it seems that I do not have the appropriate permissions.