`NDArrayMeta.__module__` makes a very expensive, unused inspect call

Question

sneakers-the-rat opened this issue a year ago · 1 comments

Problem

I have a module with several dozen Pydantic classes that carry about 200 NDArray field annotations. Importing that module takes 21.15 seconds, of which 19.95s (94%) is spent in NDArrayMeta.__module__, and nearly 100% of that is spent in its inspect.stack() call.

The __module__ property does this:

@property
def __module__(cls) -> str:
    return cls._get_module(inspect.stack(), "nptyping.ndarray")

which in turn calls:

def _get_module(cls, stack: List[FrameInfo], module: str) -> str:
    # The magic below makes Python's help function display a meaningful
    # text with nptyping types.
    return "typing" if stack[1][3] == "formatannotation" else module

This seems to be an attempt to give inspect.formatannotation a clean module name: returning "typing" hits a special case that strips the module prefix from the string representation used in help() (see python/cpython#72176).
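
The special case can be seen in isolation. This is a minimal sketch, assuming CPython's standard inspect.formatannotation; the class name Plain is made up for illustration and is not part of nptyping:

```python
import inspect
import typing


class Plain:
    """Hypothetical stand-in for a type defined outside the typing module."""


# Objects whose __module__ is "typing" are rendered without the "typing." prefix.
typing_text = inspect.formatannotation(typing.List[int])

# Most other classes are rendered from their module and qualified name.
plain_text = inspect.formatannotation(Plain)

print(typing_text)
print(plain_text)
```

Reporting "typing" as the module is what lets a type borrow this prefix-stripping behavior.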

No matter whether importing, printing, or calling help(), the return of __module__ was always nptyping.ndarray, so the call appears to be entirely unused anyway.

Options

Replace with more constrained `inspect` call

If only the current and parent frames are inspected instead of the full stack, the problem is resolved: imports now take ~1s, which is entirely pydantic's overhead, and the __module__ call takes 0.08424s cumulative (for 591 calls).

so for NDArrayMeta

@property
def __module__(cls) -> str:
    return cls._get_module(inspect.currentframe(), "nptyping.ndarray")

and SubscriptableMeta

from types import FrameType
from inspect import getframeinfo

def _get_module(cls, stack: FrameType, module: str) -> str:
    # The magic below makes Python's help function display a meaningful
    # text with nptyping types.
    return "typing" if getframeinfo(stack.f_back).function == "formatannotation" else module

That returns the same result as the original stack call, so it's a ~236x perf boost for free.
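
The equivalence of the two lookups can be checked outside nptyping. The sketch below uses made-up helper names (caller_via_stack, caller_via_frame) and a stand-in function that happens to be called formatannotation; the timing is only a rough indication and will vary with stack depth and machine:

```python
import inspect
import timeit


def caller_via_stack() -> str:
    # inspect.stack() builds a FrameInfo, including source context,
    # for every frame on the call stack before entry 1 is picked.
    return inspect.stack()[1][3]


def caller_via_frame() -> str:
    # Only the direct caller's frame is examined.
    # Note: inspect.currentframe() may return None outside CPython.
    frame = inspect.currentframe()
    return inspect.getframeinfo(frame.f_back).function


def formatannotation() -> tuple:
    # Hypothetical caller, named after the function the check looks for.
    return caller_via_stack(), caller_via_frame()


names = formatannotation()
print(names)

# Rough timing; absolute numbers depend on stack depth and the machine.
slow = timeit.timeit(caller_via_stack, number=200)
fast = timeit.timeit(caller_via_frame, number=200)
print(f"inspect.stack(): {slow:.4f}s, currentframe().f_back: {fast:.4f}s")
```

Both helpers report the name of the function that called them, which is the only piece of information _get_module uses.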

I had other solutions I was going to test, but that one worked so well I didn't bother.

Answer 1 · 2023-09-07T02:33:47.000Z

Here's a drop-in monkeypatch function for anyone else this affects:

def patch_nptyping():
    """
    nptyping makes an expensive call to inspect.stack()
    that makes imports of pydantic models take ~200x longer than
    they should.

    References:
        - https://github.com/ramonhagenaars/nptyping/issues/110
    """
    from nptyping import ndarray
    from nptyping.pandas_ import dataframe
    from nptyping import recarray
    from nptyping import base_meta_classes
    import inspect
    from types import FrameType

    # make new __module__ methods for the affected classes
    def new_module_ndarray(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.ndarray')

    def new_module_recarray(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.recarray')

    def new_module_dataframe(cls) -> str:
        return cls._get_module(inspect.currentframe(), 'nptyping.pandas_.dataframe')

    # and a new _get_module method for the parent class
    def new_get_module(cls, stack: FrameType, module: str) -> str:
        return "typing" if inspect.getframeinfo(stack.f_back).function == "formatannotation" else module

    # now apply the patches
    ndarray.NDArrayMeta.__module__ = property(new_module_ndarray)
    recarray.RecArrayMeta.__module__ = property(new_module_recarray)
    dataframe.DataFrameMeta.__module__ = property(new_module_dataframe)
    base_meta_classes.SubscriptableMeta._get_module = new_get_module
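
The patch works because a property assigned to a metaclass takes effect immediately for classes that already exist. Here is a small sketch of that mechanism, using a made-up DemoMeta and Demo rather than the real nptyping classes, with "demo.module" as a placeholder module name:

```python
import inspect


class DemoMeta(type):
    """Hypothetical stand-in for a metaclass with a slow __module__ property."""

    @property
    def __module__(cls) -> str:
        return cls._get_module(inspect.stack(), "demo.module")

    def _get_module(cls, stack, module: str) -> str:
        return "typing" if stack[1][3] == "formatannotation" else module


class Demo(metaclass=DemoMeta):
    pass


def formatannotation() -> str:
    # Hypothetical caller with the name the check looks for.
    return Demo.__module__


# Results with the original, full-stack implementation.
before = (Demo.__module__, formatannotation())


def new_module(cls) -> str:
    return cls._get_module(inspect.currentframe(), "demo.module")


def new_get_module(cls, frame, module: str) -> str:
    caller = inspect.getframeinfo(frame.f_back).function
    return "typing" if caller == "formatannotation" else module


# Replace both attributes on the metaclass; Demo picks them up immediately.
DemoMeta.__module__ = property(new_module)
DemoMeta._get_module = new_get_module

# Results with the patched, single-frame implementation.
after = (Demo.__module__, formatannotation())
print(before, after)
```

In this sketch the patched class gives the same answers as the unpatched one, both for a normal caller and for a caller named formatannotation.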
