`NDArrayMeta.__module__` makes a very expensive, unused inspect call
sneakers-the-rat opened this issue · 1 comments
Problem
I have a module with several dozen Pydantic classes that carry ~200 NDArray field annotations between them. Importing that module takes 21.15 seconds. 19.95s (94%) are spent in NDArrayMeta.__module__, nearly 100% of which is spent in its inspect.stack() call.
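For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of profiling an import with the standard library's cProfile and pstats. The helper name profile_import is hypothetical, and this is not necessarily how the numbers above were collected.

```python
import cProfile
import importlib
import io
import pstats


def profile_import(module_name: str, top: int = 10) -> str:
    """Import a module under cProfile and return a cumulative-time report.

    Hypothetical helper for illustration; pass the name of the slow module.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    importlib.import_module(module_name)
    profiler.disable()

    # Sort by cumulative time so expensive callers such as a __module__
    # property show up near the top of the report.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    # "json" is only a placeholder; substitute the module with the models.
    print(profile_import("json"))
```

The module has to be imported for the first time in the process for the report to be meaningful, since an already-imported module is returned from the cache.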
The __module__ property does this:
    @property
    def __module__(cls) -> str:
        return cls._get_module(inspect.stack(), "nptyping.ndarray")
which in turn calls:
    def _get_module(cls, stack: List[FrameInfo], module: str) -> str:
        # The magic below makes Python's help function display a meaningful
        # text with nptyping types.
        return "typing" if stack[1][3] == "formatannotation" else module
This seems to be an attempt to give a clean module name to inspect.formatannotation by returning "typing", which is special-cased to be stripped out of the string representation used in help() (see python/cpython#72176).
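To illustrate the special case being targeted, here is a small sketch of how inspect.formatannotation treats objects whose __module__ is "typing" compared with an ordinary class. The Plain class and its "mypkg.arrays" module name are made up for the example.

```python
import inspect
import typing


class Plain:
    # Hypothetical module name, set explicitly so the result does not
    # depend on where this snippet is run from.
    __module__ = "mypkg.arrays"


# Objects reporting __module__ == "typing" have the "typing." prefix
# stripped from their repr.
print(inspect.formatannotation(typing.List[int]))  # List[int]

# Ordinary classes keep their module prefix.
print(inspect.formatannotation(Plain))  # mypkg.arrays.Plain
```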
No matter whether importing, printing, or calling help(), the return value of __module__ was always nptyping.ndarray, so the call appears to be entirely unused anyway.
Options
Replace with a more constrained inspect call
If, instead of getting the full stack, only the current and parent frames are inspected, the problem is resolved: imports now take ~1s, which is entirely pydantic's overhead, and the __module__ call takes 0.08424s cumulative (for 591 calls).
So for NDArrayMeta:
    @property
    def __module__(cls) -> str:
        # Pass only the current frame instead of the full stack.
        val = cls._get_module(inspect.currentframe(), "nptyping.ndarray")
        return val
and SubscriptableMeta:
    from types import FrameType
    from inspect import getframeinfo

    def _get_module(cls, stack: FrameType, module: str) -> str:
        # The magic below makes Python's help function display a meaningful
        # text with nptyping types.
        return "typing" if getframeinfo(stack.f_back).function == "formatannotation" else module
That checks the same frame as the original inspect.stack() call (the caller of __module__), so it's a ~236x perf boost for free.
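As a standalone illustration of why the two lookups agree, here is a sketch that compares them outside of nptyping. All function names in it are hypothetical stand-ins, and the timings it prints will vary by machine and stack depth.

```python
import inspect
import timeit


def caller_via_stack() -> str:
    # Original approach: builds a FrameInfo for every frame on the stack,
    # then reads the function name (index 3) of the immediate caller.
    return inspect.stack()[1][3]


def caller_via_frame() -> str:
    # Constrained approach: only the parent frame is inspected.
    # Note: inspect.currentframe() can return None on non-CPython runtimes.
    frame = inspect.currentframe()
    return inspect.getframeinfo(frame.f_back).function


def formatannotation() -> tuple:
    # Hypothetical stand-in that merely shares its name with
    # inspect.formatannotation, so both lookups should report it.
    return caller_via_stack(), caller_via_frame()


def nested(depth: int, fn):
    # Deepen the call stack to mimic being called from inside a long
    # import chain; fn is invoked from the innermost nested() frame.
    return fn() if depth == 0 else nested(depth - 1, fn)


if __name__ == "__main__":
    print(formatannotation())
    for fn in (caller_via_stack, caller_via_frame):
        seconds = timeit.timeit(lambda: nested(50, fn), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

The cost of inspect.stack() grows with stack depth because it resolves source context for every frame, while the single-frame lookup does not.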
I had other solutions I was going to test, but that one worked so well I didn't bother.
Here's a drop-in monkeypatch function for anyone else this affects:
def patch_npytyping():
    """
    nptyping makes an expensive call to inspect.stack()
    that makes imports of pydantic models take ~200x longer than
    they should:

    References:
        - https://github.com/ramonhagenaars/nptyping/issues/110
    """
    from nptyping import ndarray
    from nptyping.pandas_ import dataframe
    from nptyping import recarray
    from nptyping import base_meta_classes
    import inspect
    from types import FrameType

    # make new __module__ methods for the affected classes
    def new_module_ndarray(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.ndarray')

    def new_module_recarray(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.recarray')

    def new_module_dataframe(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.pandas_.dataframe')

    # and a new _get_module method for the parent class
    def new_get_module(cls, stack: FrameType, module: str) -> str:
        return "typing" if inspect.getframeinfo(stack.f_back).function == "formatannotation" else module

    # now apply the patches
    ndarray.NDArrayMeta.__module__ = property(new_module_ndarray)
    recarray.RecArrayMeta.__module__ = property(new_module_recarray)
    dataframe.DataFrameMeta.__module__ = property(new_module_dataframe)
    base_meta_classes.SubscriptableMeta._get_module = new_get_module