S3FileLoaderIterDataPipe buffer_size
commonism opened this issue · 0 comments
The doc issue
The default S3 buffer size is 128 MB, i.e. 128 * (1024**2) bytes.
The example for S3FileLoaderIterDataPipe uses a buffer_size of 256.
data/torchdata/datapipes/iter/load/s3io.py
Line 154 in a5b4720
A 256-byte buffer degrades performance. The example also invites the assumption that buffer_size is given in megabytes, since 256 would then read as double the 128 MB default.
Suggest a potential alternative/fix
Document that buffer_size is in bytes, and have the example use 256 * (1024**2) as the value.
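
To make the size difference concrete, here is a minimal sketch of the arithmetic, assuming buffer_size is interpreted in bytes as described above. The variable names are illustrative only, and the commented-out call assumes the datapipe accepts buffer_size as a keyword argument; it is not taken from the library source.

```python
MIB = 1024 ** 2  # bytes in one mebibyte

default_buffer_size = 128 * MIB   # default stated in this issue
example_buffer_size = 256         # value currently used in the docs example
proposed_buffer_size = 256 * MIB  # value suggested for the docs example

# How many times smaller than the default the current example value is.
shrink_factor = default_buffer_size // example_buffer_size

print(default_buffer_size)   # 134217728
print(proposed_buffer_size)  # 268435456
print(shrink_factor)         # 524288

# Hypothetical usage (assumes a datapipe of S3 URLs named s3_urls_dp):
# S3FileLoaderIterDataPipe(s3_urls_dp, buffer_size=proposed_buffer_size)
```

Read as bytes, the current example value is 524288 times smaller than the default rather than twice as large.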