tkem/cachetools

Ttl cache does not protect itself from KeyError when entry doesn't exist during expire()

brunoais opened this issue · 8 comments

Problem:

In multithreaded environments, there's a high chance that def expire(self, time=None) is called concurrently.
When that happens, there's a chance that cache_delitem (Cache.__delitem__) is called multiple times for the same element.
When that function is called multiple times, the operator ends up executing del over the cache Mapping which causes an undocumented KeyError on all but first call to clear the cache entry.

To reproduce:

  1. Create a function with many calls. For example:
from threading import Thread
import cachetools
import random
@cachetools.cached(cachetools.TTLCache(maxsize=50, ttl=1))
def will_not_be_called_sometimes(value):
	import time
	time.sleep(value / 5)
random.seed(5)
def run():
	while True:
		will_not_be_called_sometimes(random.randrange(1, 30))
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
Thread(target=run).start()
run()
  1. Run as much as needed until the exception starts to appear.

Note: In order to work, it might require running the program in a system with multiple CPU or CPU cores.

tkem commented

https://cachetools.readthedocs.io/en/latest/#cache-implementations

Please be aware that Cache and its subclasses are not thread-safe. Access to a shared cache from multiple threads must be properly synchronized, e.g. by using one of the memoizing decorators with a suitable lock object.

Ups. I failed to check that TtlCache extends Cache. Thanks.

tkem commented

Hmmm... maybe it should say "all cache classes in this module" or something like that instead "Cache and its subclasses". Would that be clearer? Suggestions welcome!

"all cache classes in this module"

Yeah, that is a good improvement (I'm also awful with these stuff). Thank you.

tkem commented

OK, I'll think about it. Thanks for your feedback - I thought the note in the docs was pretty clear, but obviously it's not :-)