fatiando/pooch

allow silent, hashless pooch.retrieve

mathause opened this issue · 4 comments

Description of the desired feature:

With a pooch registry it is possible to download files without a hash, and without getting a warning, by setting the hash to None. I think it would sometimes be nice to be able to do the same with pooch.retrieve. This should definitely not be the default but a deliberate choice.

An example is to dynamically create the download url:

pooch.retrieve(f"{base_url}/{resolution}_{category}/{bname}.zip", None)

I abused pooch.create for this in my package but I think this is not optimal.


Option 1: Pass a sentinel value as hash, for example an Ellipsis ... or pooch.no_hash

pooch.retrieve(f"{base_url}/{resolution}_{category}/{bname}.zip", ...)
pooch.retrieve(f"{base_url}/{resolution}_{category}/{bname}.zip", pooch.no_hash)

where pooch.no_hash could be something along the lines of:

class _NoHash:
    def __repr__(self):
        return '<NoHash>'

no_hash = _NoHash()

(see also PEP 661).
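To make the difference between None and the sentinel concrete, here is a rough sketch of how a retrieve-like function could branch on it. Nothing here is existing pooch API: no_hash is the proposal from above, and resolve_known_hash and its log message are made up purely for illustration.

```python
import logging


class _NoHash:
    """Sentinel type meaning: skip hash verification and stay quiet about it."""

    def __repr__(self):
        return "<NoHash>"


no_hash = _NoHash()


def resolve_known_hash(known_hash, logger):
    """Hypothetical helper: return the hash to verify against, or None to skip.

    - no_hash: skip verification silently (the deliberate choice).
    - None: skip verification but log a notice (illustrative message).
    - anything else: returned unchanged so the caller can verify against it.
    """
    if known_hash is no_hash:
        # Identity check: only the module-level sentinel takes this branch.
        return None
    if known_hash is None:
        logger.info("No known_hash given; the download will not be verified.")
        return None
    return known_hash
```

The identity check (is no_hash) is what makes this a deliberate opt-in: None keeps the noisy behaviour, and only the explicit sentinel is silent.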

Option 2: Add a silent keyword to pooch.retrieve (which defaults to False):

pooch.retrieve(f"{base_url}/{resolution}_{category}/{bname}.zip", None, silent=True)

The first option is way cooler but the second is probably better 😉

Are you willing to help implement and maintain this feature?

Yes, I'd be fine to implement this.

I only just saw #232, which mentions that you can already silence the output, but I could not find this in the docs. Did I just not look hard enough?

OK, found it, but you kind of have to know it is there to find it, so maybe this is more of a documentation issue? (Related: pity logger.setLevel is not a context manager).
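For reference, a small wrapper gives setLevel context-manager behaviour. This is only a sketch using the standard library; temporary_log_level is a made-up name, not something pooch provides.

```python
import contextlib
import logging


@contextlib.contextmanager
def temporary_log_level(logger, level):
    """Set `level` on `logger` for the duration of the with-block, then restore it."""
    previous = logger.level  # the logger's own level, not the effective one
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore even if the body raised.
        logger.setLevel(previous)
```

With pooch this could presumably be used as `with temporary_log_level(pooch.get_logger(), "WARNING"): pooch.retrieve(url, known_hash=None)`, assuming the messages to be hidden are emitted below the WARNING level.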

@mathause passing known_hash=None in retrieve is already supported: https://www.fatiando.org/pooch/latest/api/generated/pooch.retrieve.html#pooch.retrieve Does this handle what you need?

Verbosity is something I've been trying to find a way to change without breaking code that relies on the current implementation (like @danshapero's). Ideally, we should have done this with simple print to STDERR and verbose=True flags from the start but 🤷🏽 .

The logging module doesn't really work well for us since it's not meant to be used within a library that's used in other libraries. It seemed like a perfect fit at the time but some problems have surfaced that we didn't foresee. A current problem is that if one of your dependencies silences Pooch logging, your package will also have it silenced.
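A minimal illustration of that problem, using only the standard library. The logger name and both functions are stand-ins invented for this example, not Pooch code; the point is just that a logger object is shared process-wide state.

```python
import logging

# Stand-in for a library's single, shared logger object.
SHARED = logging.getLogger("shared_library_demo")
SHARED.setLevel(logging.INFO)  # the library's default: show INFO messages


def dependency_silences_library():
    """What a third-party dependency might do at import time."""
    logging.getLogger("shared_library_demo").setLevel(logging.ERROR)


def my_package_sees_info():
    """Whether INFO messages from the library are still enabled for my package."""
    return logging.getLogger("shared_library_demo").isEnabledFor(logging.INFO)
```

Calling my_package_sees_info() before and after dependency_silences_library() shows the level change leaking across packages, since both look up the same logger.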

I'll put down some thoughts on this in a separate issue.

Thanks for the answer. Yes, I knew pooch.retrieve(url, None) works but I wasn't able to do it silently (because I did not read your docs well enough...). Let's close this and I'll follow the discussion in #302.