stgl/pymccrgb

JOSS review: Download and cache datasets


Finally, I wonder whether it is essential to include the 21.6 MB of example data, which accounts for practically the entire size of the Python wheel or source tarball. I believe it would be better to keep such example data in a separate repository (e.g., with the example notebooks there instead of in docs/source/examples) or to host it on S3, so that it is downloaded (and maybe cached locally) when users call the load_ methods of the dataset module.

openjournals/joss-reviews#1777

I decided to keep the docs as-is, but now download and cache example data from S3.
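
For anyone reading along, the download-and-cache pattern can be sketched roughly as below. This is a minimal illustration, not the code that went into pymccrgb: the function name `fetch_cached`, the cache directory, and the `.part` suffix are all hypothetical, and the URL is whatever location the data is hosted at.

```python
import os
import urllib.request
from pathlib import Path

# Hypothetical cache location; not taken from pymccrgb.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "example_datasets"


def fetch_cached(url, filename, cache_dir=DEFAULT_CACHE_DIR):
    """Return a local path to `filename`, downloading from `url` only if absent."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename

    if target.exists():
        # Cache hit: reuse the local copy, no network access.
        return target

    # Download to a temporary name first, so an interrupted transfer
    # never leaves a truncated file that looks like a valid cache entry.
    partial = target.with_name(target.name + ".part")
    urllib.request.urlretrieve(url, str(partial))
    os.replace(partial, target)
    return target
```

A `load_` function in the dataset module could then call something like `fetch_cached(url, "points.npy")` and read the returned path, so the data is fetched on first use and read from disk afterwards. The file name here is a placeholder.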

I agree that a separate docs/examples repo would be good if this grows into a larger project with more diverse examples. I usually prefer to keep the docs and example notebooks in the main repo, since I feel this makes it easier to browse the examples and maintain the docs.