awslabs/s3-connector-for-pytorch

Can we get some benchmarks?

tchaton opened this issue · 7 comments

Tell us more about this new feature.

Hey there,

It would be great to get some benchmarks for this dataset.

Hello @tchaton! Thank you for your interest in s3-connector-for-pytorch and for raising this. We will review your request as a team and get back to you with an update.

Hi @tchaton! We are working on providing a way for our customers to run benchmarks: #135. Any feedback is much appreciated. Thank you!

Sounds good @dnanuti. I will review it tomorrow.

Would you mind adding a comparison with PyTorch Lightning Data: https://github.com/Lightning-AI/pytorch-lightning/tree/master/src/lightning/data?

Hi @tchaton, thanks for the suggestion. Our intention is to enable customers to run benchmarks on their own. Would incorporating PyTorch Lightning Data into our benchmarking framework fit your needs?

Hey @dnanuti, I think enabling users to run their own benchmarks is great! This is key, and I am keen to try it out myself.

However, I think it would be great to see where this new library fits among streaming libraries such as Lightning Data or WebDataset. Every such library uses ImageNet-1M without alteration for its benchmarks, and I strongly recommend the s3-connector-for-pytorch team do the same.
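For comparing streaming libraries on equal footing, the core measurement is usually items/bytes per second through the data pipeline. Below is a minimal, hedged sketch of such a throughput harness; `dummy_stream` is a hypothetical stand-in for a real streaming dataset (e.g. an iterable dataset reading objects from S3), not part of any library's API.

```python
import time

def dummy_stream(num_items=1000, item_size=1024):
    # Hypothetical stand-in for a streaming dataset: in a real benchmark
    # this would be replaced by an iterable yielding objects from S3.
    payload = b"\0" * item_size
    for _ in range(num_items):
        yield payload

def measure_throughput(stream):
    # Time a full pass over the stream and count items and bytes read.
    start = time.perf_counter()
    items = nbytes = 0
    for obj in stream:
        items += 1
        nbytes += len(obj)
    elapsed = time.perf_counter() - start
    return items, nbytes, elapsed

items, nbytes, secs = measure_throughput(dummy_stream())
print(f"{items} items, {nbytes / 1e6:.2f} MB read")
```

Swapping different libraries' datasets into `measure_throughput` (with the same source data, e.g. ImageNet-1M) would give directly comparable numbers.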

Here are Lightning Data benchmarks for example: https://lightning.ai/lightning-ai/studios/benchmark-cloud-data-loading-libraries.

Furthermore, I believe there is room for improvement in Lightning Data by moving the backend to Rust, as this library does.

Note: Naming the client mountpoint_s3_client is quite confusing. This isn't really related to the mountpoint-s3 mounting solution.

Hey @tchaton!
Just checking in with a couple of updates:

Regarding the note: the crate of the client we are using is actually published by mountpoint-s3, and the naming reflects its alignment with that solution. This crate is not intended for general-purpose use, and we consider its interface to be unstable, as mentioned here.