Feature request: add option to warm-up country import cache
Closed this issue · 4 comments
Hello,
Thanks for this library, nice work!
I'm using it in a web application that receives lots of requests, and recently had to debug an issue of thread contention caused by the importlib.import_module
here: https://github.com/vacanza/python-holidays/blob/600ac6940a1d2112153773c9241d4f32066acedf/holidays/registry.py#L213
The threads were getting stuck trying to acquire the import lock, the call stack (obtained with py-spy
) looked like this:
Thread 1525 (active): "AnyIO worker thread"
_get_module_lock (<frozen importlib._bootstrap>:185)
__enter__ (<frozen importlib._bootstrap>:170)
_find_and_load (<frozen importlib._bootstrap>:1173)
_gcd_import (<frozen importlib._bootstrap>:1204)
import_module (importlib/__init__.py:126)
get_entity (holidays/registry.py:211)
__call__ (holidays/registry.py:193)
country_holidays (holidays/utils.py:187)
The fix for it was to call country_holidays
in my code at import time for every country i care about, thus avoiding the importlib.import_module
from being called from inside the request lifecycle as the entity
attribute will be already loaded.
What do you think of adding an option to do that by default?
For example, something like:
if os.getenv('PYTHON_HOLIDAYS_CACHE_WARMUP_ENABLED') == 'true':
# preload it here
Hi @eliasdorneles,
I know we don't have an issue template yet so I blame myself only for the fact I can't figure out what exactly caused the issue and what the preferred fix would be. In general, I'd avoid introducing env variable dependency but the PH code is totally open for contributions that fix existing users headaches.
Having said that, I'd like to request some clarification on this (multi-threading?) issue. It'd be ideal to have a PR (draft is okay) to understand the problem and assess the fix implementation approach.
Thank you!
Hello,
Right, I understand, I will try to get a minimal example that shows the problem.
I believe it's an issue related to importlib.import_module
not being thread-safe.
The web application I mentioned (which I unfortunately cannot share the code as-is) is a FastAPI app, with endpoints defined as "sync" (using regular def
and not async def
), so the requests are served by anyio worker threads.
I will come back with more details later!