ECMWFCode4Earth/ml_drought

Request to Use Axel instead of Wget for Exporters

v2thegreat opened this issue · 2 comments

Hey! I noticed that the exporters download data fairly slowly compared to what we see in our own pipeline. Have you considered using something like Axel, which splits each download across multiple parallel connections? I see that downloads are already parallelized here, but that approach carries Python overhead, and those resources might be better used elsewhere.

Looking at how you've done it in src/exporters/chirps.py, it seems that it should only require modifying this line: with axel configured correctly, the downloads should be faster and produce the same results.
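
A rough sketch of what such a swap could look like, not the project's actual code. The helper names `build_download_command` and `download` are hypothetical, as is the default of 4 connections; the only command-line flags used are axel's `-q` (quiet), `-n` (number of connections) and `-o` (output file), and wget's `-q` and `-O`. The sketch falls back to wget when axel is not installed, so axel stays an optional dependency.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_download_command(
    url: str,
    output_file: Path,
    connections: int = 4,
    tool: Optional[str] = None,
) -> List[str]:
    """Build the argument list for downloading `url` to `output_file`.

    Hypothetical helper: if `tool` is not given, axel is used when it is
    found on the PATH, otherwise wget.
    """
    if tool is None:
        tool = "axel" if shutil.which("axel") else "wget"

    if tool == "axel":
        # -n: number of parallel connections, -o: output file, -q: quiet
        return ["axel", "-q", "-n", str(connections), "-o", str(output_file), url]
    if tool == "wget":
        # -O: output file, -q: quiet (wget uses a single connection)
        return ["wget", "-q", "-O", str(output_file), url]
    raise ValueError(f"Unsupported download tool: {tool}")


def download(url: str, output_file: Path, connections: int = 4) -> None:
    """Download one file, raising CalledProcessError if the tool fails."""
    output_file.parent.mkdir(parents=True, exist_ok=True)
    command = build_download_command(url, output_file, connections)
    # Passing a list (no shell) avoids quoting problems in URLs and paths.
    subprocess.run(command, check=True)
```

Whether this actually beats the existing setup would need measuring: several connections per file only help when the server supports range requests and the bottleneck is per-connection throughput.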

Finally, since downloading the data is an important part of the pipeline, this could substantially speed up the overall process as the project grows to include other datasets.

Hi!

Axel seems very interesting - we'll take a look! We do want to minimize the number of dependencies in the pipeline, so we might not integrate axel straight away.

Thank you!

It's really great that you're taking an interest in the pipeline, @v2thegreat! Do you work with environmental data often? How would you like to use the pipeline?