Environment variable for `kExecutorPoolSize` in S3 Filesystem
jeongukjae opened this issue · 0 comments
It would be better if I could adjust the value of `kExecutorPoolSize` with an environment variable, as `kS3MultiPartDownloadChunkSize` does.
For more background:
I recently found that my program (written in C++) consumes more memory (about 1.3GB~1.5GB) when it uses the S3 filesystem from tensorflow/io rather than a local filesystem. I'm pretty sure this is because the transfer manager consumes that memory: `kS3MultiPartDownloadChunkSize` (50MB) * `kExecutorPoolSize` (25 + 1) ~= 1.27GB, and maybe more memory for the threads?
io/tensorflow_io/core/filesystems/s3/s3_filesystem.cc
Lines 47 to 62 in 2b8f277
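The estimate above can be reproduced with a few lines of C++. This is only a sketch of the arithmetic in this issue, not code from tensorflow/io: the function name `EstimateTransferBufferBytes` is hypothetical, and the "pool size + 1" buffer count is my assumption about how the transfer manager allocates its buffers.

```cpp
#include <cstdint>

// Values quoted in this issue: 50MB chunks and an executor pool of 25.
constexpr std::uint64_t kChunkSizeBytes = 50ull * 1024 * 1024;
constexpr std::uint64_t kPoolSize = 25;

// Hypothetical helper: estimates the buffer memory as
// chunk size * (pool size + 1). The "+ 1" is an assumption from the
// issue text, not a confirmed detail of the transfer manager.
constexpr std::uint64_t EstimateTransferBufferBytes(std::uint64_t chunk_size,
                                                    std::uint64_t pool_size) {
  return chunk_size * (pool_size + 1);
}

// Converts a byte count to GiB for easier comparison with observed usage.
constexpr double ToGiB(std::uint64_t bytes) {
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

With the defaults, `ToGiB(EstimateTransferBufferBytes(kChunkSizeBytes, kPoolSize))` is about 1.27, which matches the lower end of the memory usage I observed.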
To test my hypothesis, I set the env var `S3_MULTI_PART_DOWNLOAD_CHUNK_SIZE=1024`, and the memory utilization dropped as I expected.
In my use case, I can wait a few more seconds for models to download and would rather have a smaller memory footprint, so I want to set both `kS3MultiPartDownloadChunkSize` and `kExecutorPoolSize` to smaller values.
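As a rough idea of what I am asking for, the pool size could be read from the environment with a fallback to the current default. This is a minimal sketch, not the existing implementation: the variable name `S3_EXECUTOR_POOL_SIZE` and both function names are hypothetical, and the default of 25 is the value quoted above.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Default quoted in this issue for kExecutorPoolSize.
constexpr int kDefaultExecutorPoolSize = 25;

// Hypothetical helper: parses a positive integer from `raw`. Returns
// `fallback` if the value is unset, empty, malformed, non-positive, or
// does not fit in an int.
int ParsePoolSize(const char* raw, int fallback) {
  if (raw == nullptr || *raw == '\0') return fallback;
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX) {
    return fallback;
  }
  return static_cast<int>(value);
}

// Hypothetical accessor: S3_EXECUTOR_POOL_SIZE is a made-up name used
// only for illustration; it is not an existing tensorflow/io variable.
int ExecutorPoolSizeFromEnv() {
  return ParsePoolSize(std::getenv("S3_EXECUTOR_POOL_SIZE"),
                       kDefaultExecutorPoolSize);
}
```

Falling back to the default on any invalid value keeps the current behavior for everyone who does not set the variable.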