[QST] Why is the GPU version of Zstd slower than the CPU version?

Question

hsioamin opened this issue 2 years ago · 3 comments

[configuration]
CUDA Version: 11.4
OS in Docker container: Ubuntu 20.04.2 LTS
GPU card: A10 x 1
CPU: AMD EPYC 7413 24-Core Processor
TensorFlow version: 2.5.0+nv
nvcomp binary version: nvcomp_2.4.1_x86_64_11.x

[testing data]
image data size: 675 x 78

[Zstd decompression time]
On CPU (Meta API): 0.000069 s
On GPU (nvcomp API): 0.000038 s

[problem description]
I'm creating a TensorFlow custom op that calls the Zstd decompression API of nvCOMP. I simply copied and pasted the low_level_quickstart_example into the custom op, so in this case it performs both compression and decompression, and the custom op is launched from a Python script. However, I find that cudaStreamSynchronize() takes 0.014081 s, which makes the GPU version of Zstd slower than the CPU version. What am I doing wrong?

Thanks in advance!

Answer 1 · 2022-10-24T21:50:44.000Z

Hi @hsioamin.

Generally, GPU decompression is less efficient than CPU decompression unless you can provide a larger batch of data.

nvCOMP becomes useful when you have hundreds or thousands of such images that you can decompress in a single batch. The API call should take about the same amount of time, because the latency of decompressing the additional images is hidden behind the decompression of the first image.

Are you able to experiment with compressing and decompressing a batch of images? Note that nvCOMP 2.4 also provides a batched compression API for Zstd.
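
To get a rough feel for what "hundreds or thousands" means with the numbers in this issue, here is a back-of-the-envelope sketch in Python. It is not a benchmark and does not call nvCOMP. It takes the three timings reported above and assumes, following the point that the API should take about the same amount of time, that one batched GPU call costs the same fixed time regardless of batch size. That is an idealization: real GPU time will grow with batch size at some point, and the measured synchronize time in the original test also covers the compression step. The function names are hypothetical.

```python
import math

# Timings reported in this issue (seconds).
CPU_PER_IMAGE_S = 0.000069  # Zstd decompression of one image on the CPU
GPU_CALL_S = 0.000038       # nvCOMP decompression call as timed on the host
GPU_SYNC_S = 0.014081       # time spent in cudaStreamSynchronize()


def cpu_time(n_images: int) -> float:
    """CPU cost, assuming images are decompressed one after another."""
    return n_images * CPU_PER_IMAGE_S


def gpu_time(n_images: int) -> float:
    """GPU cost under the ASSUMPTION that a batched call takes a fixed
    time, independent of how many images are in the batch."""
    return GPU_CALL_S + GPU_SYNC_S


def break_even_batch() -> int:
    """Smallest batch size at which the CPU total exceeds the GPU total."""
    return math.ceil(gpu_time(1) / CPU_PER_IMAGE_S)


if __name__ == "__main__":
    n = break_even_batch()
    print(f"break-even batch size: {n} images")
    print(f"CPU: {cpu_time(n):.6f} s, GPU: {gpu_time(n):.6f} s")
```

Under that assumption the break-even point is about 205 images of this size, which is consistent with the suggestion to batch hundreds or thousands of images. The only way to know the real crossover is to measure it with the batched API on your own data.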
