tkem/cachetools

Support for file- or database-backed caches?

Closed this issue · 3 comments

Would you consider adding file-backed or perhaps database-backed (e.g. Redis) stores for the caches this package provides?
Quick iteration while coding often requires process restarts, and unless I've missed something in the docs, that means all caches are lost.

tkem commented

There are several packages that provide cachetools-compatible file-based implementations.
See for example #128.
I've also heard about at least one using redis, but can't remember the details.
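For the restart problem described in the question, a rough standard-library sketch of the idea might look like the following. It is not one of the packages mentioned above and does not use cachetools; `shelve_memoize` is a hypothetical name, and the sketch assumes positional arguments with a stable `repr` and picklable return values. It also never evicts anything, so it is no substitute for the cachetools policies.

```python
import functools
import os
import shelve
import tempfile


def shelve_memoize(path):
    """Hypothetical decorator: memoize a function in a shelve file at `path`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = repr(args)  # shelve keys must be strings
            with shelve.open(path) as db:
                if key not in db:
                    db[key] = func(*args)
                return db[key]
        return wrapper
    return decorator


calls = []
path = os.path.join(tempfile.mkdtemp(), "memo")


def make_square():
    # Defining the function afresh stands in for a process restart.
    @shelve_memoize(path)
    def square(x):
        calls.append(x)
        return x * x
    return square


first = make_square()(7)   # computed and written to disk
second = make_square()(7)  # read back from disk; square() is not called again
```

Opening the shelf on every call keeps the sketch short but is slow; a longer-lived handle would be the obvious refinement.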

Why would this need an extension? Couldn't the cachetools *Cache classes just accept a dictionary 'store' argument? Right now it's hidden under _Cache__data.
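To make the 'store' idea concrete, here is a minimal sketch of what a cache with a pluggable backing mapping could look like. It is not how cachetools is implemented and cachetools has no such argument as far as this thread shows; `StoreBackedLRU` and its `store` parameter are hypothetical names, and only the standard library is used.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


class StoreBackedLRU(MutableMapping):
    """Hypothetical LRU cache that keeps values in a caller-supplied mapping."""

    def __init__(self, maxsize, store=None):
        self.maxsize = maxsize
        # Any MutableMapping should work here: a dict, a shelve.Shelf, ...
        self.store = {} if store is None else store
        # Recency is tracked in memory only; keys already present in the
        # store are adopted in iteration order.
        self._order = OrderedDict((k, None) for k in self.store)

    def __getitem__(self, key):
        value = self.store[key]        # raises KeyError on a miss
        self._order[key] = None
        self._order.move_to_end(key)   # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self.store[key] = value
        self._order[key] = None
        self._order.move_to_end(key)
        # Evict least recently used entries from the cache and the store.
        while len(self._order) > self.maxsize:
            oldest, _ = self._order.popitem(last=False)
            del self.store[oldest]

    def __delitem__(self, key):
        del self.store[key]
        self._order.pop(key, None)

    def __iter__(self):
        return iter(self._order)

    def __len__(self):
        return len(self._order)


backing = {}
cache = StoreBackedLRU(maxsize=2, store=backing)
cache["a"] = 1
cache["b"] = 2
cache["a"]        # touch "a" so "b" becomes the eviction candidate
cache["c"] = 3    # evicts "b" from both the cache and the backing dict
```

The eviction policy lives in the cache while the values live wherever the caller wants, which is the separation the comment above is asking for.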

fwiw, here's how i wrapped a fsspec fs map

import cachetools as ct


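def get_cache(typ='Cache', data_map=None, *a, **k):
    data_map = {} if data_map is None else data_map  # no shared mutable default
    # Override __repr__ on a subclass instead of patching the cachetools class
    # for every user; see https://github.com/tkem/cachetools/issues/227
    class _Wrapped(getattr(ct, typ)):
        def __repr__(self):
            return f"{typ}(data_map={data_map}, {a}, {k})"
    cache = _Wrapped(*a, **k)
    cache._Cache__data = data_map  # name-mangled private attribute, not public API
    return cache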


def wrap_fsspecmap(fsmap, *serialization_args, **serialization_kwargs):
    import wrapt
    from cloudpickle import load, dump
    class FSSpecMap(wrapt.ObjectProxy):
        def __getitem__(self, key):
            path = self.__wrapped__.root + f'/{key}'
            try:
                # hand cloudpickle the open file instead of reading bytes first
                with self.__wrapped__.fs.open(path, 'rb') as f:
                    return load(f)
            except FileNotFoundError:
                # the cache treats KeyError from its store as a miss
                raise KeyError(key) from None

        def __setitem__(self, key, value):
            path = self.__wrapped__.root + f'/{key}'
            with self.__wrapped__.fs.open(path, 'wb') as f:
                dump(value, f, *serialization_args, **serialization_kwargs)
    return FSSpecMap(fsmap)

seems to work.
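For anyone without fsspec at hand, the same one-file-per-key idea can be sketched with only the standard library. This is an illustration under assumptions, not the code above: `PickleDirStore` is a hypothetical name, keys are assumed to be simple non-empty strings, and plain `pickle` stands in for cloudpickle. Note the translation of `FileNotFoundError` into `KeyError`, which is what mapping-style callers generally expect on a miss.

```python
import os
import pickle
import tempfile
from collections.abc import MutableMapping
from urllib.parse import quote, unquote


class PickleDirStore(MutableMapping):
    """Hypothetical mapping that stores each value as a pickle file in a directory."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Keys are assumed to be strings; percent-encode them into file names.
        return os.path.join(self.root, quote(key, safe=""))

    def __getitem__(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return pickle.load(f)
        except FileNotFoundError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)

    def __delitem__(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            raise KeyError(key) from None

    def __iter__(self):
        return (unquote(name) for name in os.listdir(self.root))

    def __len__(self):
        return len(os.listdir(self.root))


root = tempfile.mkdtemp()
store = PickleDirStore(root)
store["answer"] = {"value": 42}

# A second instance on the same directory stands in for a restarted process.
reopened = PickleDirStore(root)
restored = reopened["answer"]
```

A store like this only handles persistence; size limits and eviction would still come from whatever cache sits in front of it.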

I like how clean this library is, and that it's a home for cache tools: caching algos, keymaps (and locking too!). You should be able to come to this lib and assemble a cache and memoizer that work for your purposes, relying on other libraries for storage. There is so much overlap in this space.

I came here having used joblib.Memory for most of a year. It's getting messy IMO. I also didn't go for klepto.