vc1492a/tidd

Move data hosting to AWS S3 bucket and add data directory to gitignore

Closed this issue · 2 comments

The data is small enough to store in the repository and pull directly from GitHub, but we can significantly reduce the size of the library used to run the analyses and experiments by removing the data from the package that is distributed via GitHub and elsewhere.

I suggest we host the data in an AWS S3 bucket that's made public and then provide a curl command which can be used to download the data locally before running the experiments.
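As a rough sketch of what that download step might look like: the bucket name, region, object key, and local `data` directory below are all placeholders I made up for illustration, not the real values, and the URL assumes the standard virtual-hosted-style S3 address for a public bucket.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace with the real public bucket details.
BUCKET="example-tidd-data"
REGION="us-west-2"
DATA_DIR="data"

# Build the public HTTPS URL for an object key in the bucket.
data_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$BUCKET" "$REGION" "$1"
}

# Download one object into DATA_DIR; --fail makes curl exit non-zero on HTTP errors.
fetch_data() {
  mkdir -p "$DATA_DIR"
  curl --fail --location --silent --show-error \
    --output "$DATA_DIR/$(basename "$1")" "$(data_url "$1")"
}

# Example call with a hypothetical object key:
# fetch_data "archive.tar.gz"
```

If the files end up in a local `data` directory like this, a `data/` line in `.gitignore` would keep them out of commits.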

Uploaded the data! Next week I'll make sure folks are able to pull it via curl and will update the instructions in the README then.

Pushed as part of 5d90afa.