cachedmethod not working as expected with locks
npezolano opened this issue · 4 comments
The example below runs, but it doesn't seem to work as expected: calc gets called multiple times for the same n, when it should be locked by one thread and then cached. I tried both Lock
and RLock
and got the same result. The cache itself does seem to work: if I rerun res = obj.multithreaded_run()
afterwards, everything is served from the cache. The problem appears to be that when the function being cached takes up to a few seconds to run and is needed by multiple threads at once, it doesn't lock properly. The problem gets worse with more threads. This is a follow-up on the previous suggestion in #274.
from multiprocessing.pool import ThreadPool as Pool
from threading import Lock, RLock

import cachetools
import operator
import time
import numpy as np

cores = 4


class DataObj(object):
    def __init__(self, enable_cache=True, **kwargs):
        self.enable_cache = enable_cache
        self.lock = RLock()
        self.cache = cachetools.LRUCache(maxsize=200)

    @cachetools.cachedmethod(operator.attrgetter('cache'), lock=operator.attrgetter('lock'))
    def calc(self, n):  # Should only be called once for each n
        print(f'running n : {n}')
        sleeptime = np.random.uniform(0, 1)
        time.sleep(sleeptime)  # Sleep timer to simulate loading some data
        return n


class RunnerObj(object):
    def __init__(self, data_obj=None, window=60, size=100, **kwargs):
        self.data_obj = data_obj
        self.window = window
        self.size = size

    def run(self, data):
        n = 0
        for i in data:
            n += self.data_obj.calc(i)
        return n

    def multithreaded_run(self):
        seq = [i for i in range(self.size)]
        window_size = self.window
        data = []
        # Create example windows of data to run
        for i in range(len(seq) - window_size + 1):
            data.append(seq[i: i + window_size])
        pool = Pool(cores)
        res = pool.map(self.run, data)
        return res


obj = RunnerObj(data_obj=DataObj())
res = obj.multithreaded_run()
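For what it's worth, the behavior makes sense once you look at what the decorator does: as far as I can tell, the lock is held only around the cache lookup and the store, not around the wrapped method call itself. A simplified paraphrase of that locking strategy (not the exact library source; keying and size handling are stripped down):

def cachedmethod_sketch(cache, lock):
    def decorator(method):
        def wrapper(self, *args):
            k = args                         # simplified cache key
            with lock(self):                 # lock held for the lookup only
                try:
                    return cache(self)[k]
                except KeyError:
                    pass                     # cache miss: fall through
            v = method(self, *args)          # computed OUTSIDE the lock, so
                                             # concurrent misses on the same
                                             # key all run the method
            with lock(self):                 # lock re-acquired just to store
                return cache(self).setdefault(k, v)
        return wrapper
    return decorator

Any thread that misses while another thread is still computing the value will run the method again, which is exactly what the example above shows.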
7677e50 looks to fix the issue I'm facing above, but causes a small performance impact.
@npezolano: I have to admit that the title of this issue, "not working as expected with locks", is probably true, sorry.
However, it is AFAIK the same behavior the standard library's @lru_cache decorator shows, and any PRs trying to fix this have been rejected due to added complexity and runtime overhead. See #224 for a somewhat thorough discussion.
So, sorry, unless someone comes up with some truly ingenious idea, this will be left open.
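To illustrate the trade-off: the obvious fix is to hold the lock across the whole computation, which does guarantee a single call per key but serializes all calls, including hits and misses on unrelated keys. A rough sketch of that hypothetical variant (SerializedDataObj and _load are illustrative names, not cachetools API):

import threading
import time

import cachetools


class SerializedDataObj:
    def __init__(self):
        self.lock = threading.RLock()
        self.cache = cachetools.LRUCache(maxsize=200)

    def calc(self, n):
        with self.lock:
            try:
                return self.cache[n]
            except KeyError:
                v = self._load(n)  # slow work done while holding the lock,
                self.cache[n] = v  # so every other caller waits, even for
                return v           # different keys

    def _load(self, n):            # hypothetical uncached helper
        time.sleep(0.5)            # stand-in for the real computation
        return n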
On a final note: Yes, it's an issue, probably an issue that should be at least stated in the docs.
However, if this is a real issue for your individual use case, it's probably also an issue in your back-end service, i.e. this would arise even without caching (probably more so), and maybe should be handled there...
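For example, one way to handle it on the caller's side is per-key locking, so only threads requesting the same n wait on each other while different keys still run in parallel. A sketch under the assumption that keys are hashable (DedupDataObj, key_locks, and _load are illustrative names, not part of cachetools):

import random
import threading
import time
from collections import defaultdict

import cachetools


class DedupDataObj:
    def __init__(self):
        self.cache = cachetools.LRUCache(maxsize=200)
        self.cache_lock = threading.Lock()           # guards cache and key_locks
        self.key_locks = defaultdict(threading.Lock)

    def calc(self, n):
        with self.cache_lock:
            if n in self.cache:
                return self.cache[n]
            key_lock = self.key_locks[n]
        with key_lock:                               # blocks only threads wanting this n
            with self.cache_lock:
                if n in self.cache:                  # re-check: another thread may have filled it
                    return self.cache[n]
            v = self._load(n)                        # slow back-end call, outside cache_lock
            with self.cache_lock:
                self.cache[n] = v
                self.key_locks.pop(n, None)          # optional cleanup
            return v

    def _load(self, n):
        time.sleep(random.uniform(0, 1))             # simulate loading some data
        return n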