NVIDIA/nvcomp

[QST] zstd Decompression takes a lot of HBM

xingwenqiang opened this issue · 6 comments

Hi,
I'm using nvcomp 2.3 to process parquet files in my project. Compared to gzip decompression, zstd decompression uses a lot of HBM: decompressing gz.parquet takes 1.01 GB, while zstd takes 7.5 GB. Is that expected? Looking forward to your answers!

Hi Xing,

Please try the latest nvCOMP 2.6.1 and let us know how it goes. Are you decompressing parquet using cuDF?

Are you referring to HBM usage on GPU?

ZSTD decompression does use significant "scratch" memory on the GPU. Your application controls that allocation (through the "computeScratch" APIs). What size scratch allocation is nvCOMP 2.6.1 requesting compared to the size of your input batch?
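
For reference, a minimal sketch of that query with the batched ZSTD API might look like the following. This assumes the 2.6-era C signatures; the helper name and error handling are purely illustrative:

```cpp
// Minimal sketch: query the ZSTD decompression scratch ("temp") size for a
// batch of compressed chunks and allocate it before calling the batched
// decompress API. Signatures assume the nvCOMP 2.6 batched C API.
#include <cuda_runtime.h>
#include <nvcomp/zstd.h>

size_t query_and_alloc_scratch(size_t num_chunks,
                               size_t max_uncompressed_chunk_bytes,
                               size_t total_uncompressed_bytes,
                               void** d_temp_out)
{
  size_t temp_bytes = 0;
  // The "Ex" variant also takes the total uncompressed size of the batch,
  // which lets nvCOMP size the scratch buffer more tightly than the
  // per-chunk upper bound alone.
  nvcompStatus_t st = nvcompBatchedZstdDecompressGetTempSizeEx(
      num_chunks, max_uncompressed_chunk_bytes, &temp_bytes,
      total_uncompressed_bytes);
  if (st != nvcompSuccess) {
    return 0;  // handle the error as appropriate in your application
  }

  // Your application owns this allocation, so it bounds the extra HBM used.
  cudaMalloc(d_temp_out, temp_bytes);
  return temp_bytes;
}
```

The returned temp size is what shows up as the extra HBM usage, so comparing it against the size of your input batch tells you whether the allocation is in the expected range.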

-Eric

Thanks a lot, the answer is very helpful. I tried nvCOMP 2.6.1 and the 'nvcompBatchedSnappyDecompressGetTempSizeEx' API to calculate the scratch size, and GPU memory usage dropped to 3.5 GB. After testing, I found that the scratch size is positively correlated with the zstd file size, which means I need to control the zstd file size if I want to control GPU memory consumption. If I want to further reduce the scratch size, do you have any suggestions? Looking forward to your reply!

Hi Xing,

That's the correct API to use. How large is your file and how large is the scratch allocation? nvCOMP zstd does require scratch storage which is a multiple of the total size of the batch that you provide.

If the GPU memory requirement is still too large, you could decompress part of the batch in one call to the API and the next part of the batch in the next call.

Note, if you're trying to decompress one large ZSTD file, that currently won't work well in nvCOMP ZSTD. We rely on simultaneously processing a batch of many ZSTD buffers for good performance.
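
A rough sketch of that sub-batching approach is below. It's illustrative only: the pointer and size arrays are assumed to already live in device memory, and the batched ZSTD calls follow the 2.6-era C API as I understand it.

```cpp
// Sketch: cap scratch usage by decompressing fixed-size sub-batches of chunks
// instead of the whole batch at once. Error handling is omitted for brevity.
#include <algorithm>
#include <cuda_runtime.h>
#include <nvcomp/zstd.h>

void decompress_in_subbatches(
    const void* const* d_comp_ptrs,     // device array: compressed chunk pointers
    const size_t* d_comp_bytes,         // device array: compressed chunk sizes
    const size_t* d_uncomp_bytes,       // device array: uncompressed chunk sizes
    size_t* d_actual_uncomp_bytes,      // device array: actual bytes written
    void* const* d_uncomp_ptrs,         // device array: output chunk pointers
    nvcompStatus_t* d_statuses,         // device array: per-chunk status codes
    size_t num_chunks,
    size_t max_uncompressed_chunk_bytes,
    size_t sub_batch_size,              // knob: smaller => less scratch per call
    cudaStream_t stream)
{
  for (size_t start = 0; start < num_chunks; start += sub_batch_size) {
    const size_t n = std::min(sub_batch_size, num_chunks - start);

    // Scratch is sized for this sub-batch only, not the full batch.
    size_t temp_bytes = 0;
    nvcompBatchedZstdDecompressGetTempSize(n, max_uncompressed_chunk_bytes,
                                           &temp_bytes);
    void* d_temp = nullptr;
    cudaMalloc(&d_temp, temp_bytes);

    nvcompBatchedZstdDecompressAsync(
        d_comp_ptrs + start, d_comp_bytes + start,
        d_uncomp_bytes + start, d_actual_uncomp_bytes + start,
        n, d_temp, temp_bytes,
        d_uncomp_ptrs + start, d_statuses + start, stream);

    cudaStreamSynchronize(stream);
    cudaFree(d_temp);
  }
}
```

The sub-batch size trades peak scratch memory against GPU utilization: the smaller it is, the less HBM is needed for scratch, but fewer chunks are processed in parallel per call.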

Hi,

Thanks for the reply. Limiting the batch size per call works well. As you said, the task is decompressing zstd.parquet using cuDF. While optimizing the job, we found that decompression performance is negatively correlated with the parquet page size. Is that expected? Looking forward to your reply!

Hi Xing,

Making sure, are you using cuDF 23.02?

If I understand correctly, you're asking if it's expected for the decompression time to increase (get slower) as the parquet compression page size increases. This will be true until the file is very large (such that we have enough larger pages to fully saturate the GPU).

I appreciate your help; controlling the file size did the trick. I increased the row group size and decreased the page size, and now the task works well.
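
For anyone finding this later, those two knobs look roughly like this when writing the files. This sketch uses Apache Arrow's C++ parquet writer purely for illustration (cuDF's parquet writer exposes comparable row-group and page-size options); the exact sizes are arbitrary examples:

```cpp
// Illustrative only: write a ZSTD parquet file with larger row groups and a
// smaller data page size, using Apache Arrow's C++ parquet writer.
#include <arrow/api.h>
#include <arrow/io/file.h>
#include <parquet/arrow/writer.h>

arrow::Status write_tuned_parquet(const std::shared_ptr<arrow::Table>& table)
{
  // Smaller data pages -> more, smaller compressed chunks per batch.
  auto props = parquet::WriterProperties::Builder()
                   .compression(parquet::Compression::ZSTD)
                   ->data_pagesize(64 * 1024)  // 64 KiB pages
                   ->build();

  ARROW_ASSIGN_OR_RAISE(auto outfile,
                        arrow::io::FileOutputStream::Open("out.zstd.parquet"));

  // Larger row groups: up to 10M rows per row group in this sketch.
  return parquet::arrow::WriteTable(*table, arrow::default_memory_pool(),
                                    outfile, /*chunk_size=*/10'000'000, props);
}
```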