tkem/cachetools

cache_clear() and cache_info() would be handy in regular decorators

Closed this issue · 11 comments

Just trying cachetools for the first time. I noticed that cache_clear and cache_info attributes are available only on those caches provided for backwards compatibility with functools.lru_cache(). Is there some reason they (or something like them) aren't available on functions memoized with @cached(...)?

tkem commented

cache_clear (and other operations on the actual cache object) can already be performed by assigning the cache to a variable, as shown in https://cachetools.readthedocs.io/en/latest/#cachetools.cached
I agree that a proper cache_clear method would be somewhat easier to use, and I've been thinking about (re-)adding it, though.
cache_info, OTOH, does have some performance impact (which may be negligible, depending on your application/use case), and I've actually found little use for it. So this would preferably be something optional, which you could enable for debugging/testing, but I've not put much thought into this.
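
For reference, a minimal sketch of the "assign the cache to a variable" approach described above. The names `lookup_cache` and `square` and the `maxsize` value are illustrative assumptions, not taken from the documentation.

```python
from cachetools import LRUCache, cached

# Keep a reference to the cache object so it can be inspected or cleared later.
lookup_cache = LRUCache(maxsize=128)  # hypothetical name and size


@cached(lookup_cache)
def square(n):
    return n * n


square(3)
square(4)
size_before = len(lookup_cache)  # two distinct arguments have been cached

# Clearing the mapping plays the role of cache_clear().
lookup_cache.clear()
size_after = len(lookup_cache)
```

If the decorator is given a lock, the same lock should probably be held while clearing the cache from other code.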

Got it. I missed that.

As for cache_info, I use information about caches in other contexts to check whether my intuitions about caching the output of a function were correct. In my current example, I want to use a cache to reduce database queries. If the hit ratio isn't high enough, I probably haven't achieved my goal.

tkem commented

I'll be thinking about this, but don't expect anything soon...

Regarding the performance impact of cache_info: for debugging or testing, it would be reasonable to use a wrapper object that keeps track of the counters while delegating the logic to a real cache instance.
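
A rough sketch of what such a wrapper could look like, using only the standard library. `CountingCache` and `lookup` are hypothetical names, and a plain dict stands in for the real cache instance; this is not part of cachetools.

```python
from collections.abc import MutableMapping


class CountingCache(MutableMapping):
    """Hypothetical wrapper: counts lookups, delegates storage to `inner`."""

    def __init__(self, inner):
        self.inner = inner
        self.hits = 0
        self.misses = 0

    def __getitem__(self, key):
        try:
            value = self.inner[key]
        except KeyError:
            self.misses += 1
            raise
        self.hits += 1
        return value

    def __setitem__(self, key, value):
        self.inner[key] = value

    def __delitem__(self, key):
        del self.inner[key]

    def __iter__(self):
        return iter(self.inner)

    def __len__(self):
        return len(self.inner)


stats = CountingCache({})  # a plain dict stands in for a real cache


def lookup(key):
    # Typical memoization pattern: try the cache first, compute on a miss.
    try:
        return stats[key]
    except KeyError:
        value = key.upper()
        stats[key] = value
        return value


lookup("a")
lookup("a")
lookup("b")
```

Note that inherited helpers such as `get()` and `in` also go through `__getitem__`, so they would be counted as well. How the counters line up when such an object is passed to a decorator depends on which mapping methods the decorator calls.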

I'd be happy to see this too. I chose the package because the coding style seems disciplined (that's a rare compliment from me :) ), and I have no intention of turning away from it (nor am I likely to). I just think that, at least for testing purposes, the ability to reset things (including caches) can be a useful convenience... I might be right in the middle of a use case :)

I'd also love to see some cache info tracked, so that the caller can easily see if the last call was a cache hit or a cache miss.

>>> import cachetools.func
>>> @cachetools.func.lru_cache
... def count_vowels(sentence):
...     sentence = sentence.casefold()
...     return sum(sentence.count(vowel) for vowel in 'aeiou')
... 
>>> before = count_vowels.cache_info().hits
>>> count_vowels('abc')
1
>>> if count_vowels.cache_info().hits > before:
...     print("HIT")
... else:
...     print("MISS")
... 
MISS

I ran into this need because I was using the decorators (with cache_info, ...) but now need to add a getsizeof override (I'm storing big objects and want to limit the cache by their size, not by their count).
I can't get both :-(
I can either use a TTLCache and specify getsizeof, or use ttl_cache (and get cache_info, but no getsizeof).
I'd expect there to be different access points (decorators, caches in a variable, …) that all use the same underlying code, so you could use each feature no matter how you instantiate the cache.

tkem commented

@oesteban-vx and everyone else: Thanks for nagging me ;-)

cachetools v5.3.0 adds cache_info() to the @cached decorator. Any feedback would be welcome!
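
A minimal sketch of how this might be combined with a size-limited cache, assuming cachetools 5.3.0 or later. The function `load_blob`, the limits, and the use of `len` as the size function are illustrative assumptions.

```python
from cachetools import TTLCache, cached


# getsizeof=len limits the cache by total payload length rather than item count;
# info=True asks the decorator to keep hit/miss statistics.
@cached(TTLCache(maxsize=1024, ttl=300, getsizeof=len), info=True)
def load_blob(name):
    return name.encode() * 10  # stand-in for fetching a large object


load_blob("a")  # first call: computed and stored
load_blob("a")  # second call: served from the cache
info = load_blob.cache_info()
print(info)
```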

Hi @tkem , I tried it and it worked great, but I did have to change:

@ttl_cache(maxsize=512, ttl=60 * 5)

Into

@cached(TTLCache(maxsize=512, ttl=60 * 5), info=True)

Might be nice to provide the new feature directly on the cachetools.func decorators too?

tkem commented

@wimglenn: Huh? At least for me, this

from cachetools.func import ttl_cache

@ttl_cache(maxsize=512, ttl=60 * 5)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(256))
print(fib.cache_info())

prints

141693817714056513234709965875411919657707794958199867
CacheInfo(hits=254, misses=257, maxsize=512, currsize=257)

with cachetools v5.3.0.

🤦 It was a problem between keyboard and chair. I went and added info=True to every cachetools decorator and got

TypeError: ttl_cache() got an unexpected keyword argument 'info'

Never mind!