scientific-python/lazy-loader

Missing re-exported imports in type stubs

Opened this issue · 7 comments

Consider a module like

foo/
    __init__.py
    __init__.pyi
    bar.py
    baz.py

Where __init__.py is

import lazy_loader as lazy

# this assumes there is a `.pyi` file adjacent to this module
__getattr__, __dir__, __all__ = lazy.attach_stub(__name__, __file__)

__init__.pyi is

from .bar import *
from .baz import *

https://typing.readthedocs.io/en/latest/source/stubs.html#imports says that * imports are re-exported. Therefore lazy.attach_stub should resolve the star imports from bar.py and baz.py. Instead foo.__all__ == [*].

Looking a little deeper, I think this happens in
https://github.com/scientific-python/lazy_loader/blob/2cb4343d9e2c2b71f412b2147bf34762b9e35e6f/lazy_loader/__init__.py#L296-L300

where alias.name == * doesn't trigger a lookup.

In trying to fix this I found a semi-viable solution

# Function to get importable names from a module
def get_importable_names(module_name):
    module = importlib.import_module(module_name)
    return getattr(module, "__all__", [name for name in dir(module) if not name.startswith('_')])

# Class to modify the AST to replace star-imports with explicit imports
class StarImportReplacer(ast.NodeTransformer):
    def visit_ImportFrom(self, node):
        if node.module and '*' in [alias.name for alias in node.names]:
            # Get list of importable names from the module
            names = get_importable_names(node.module)
            # Replace the star-import with explicit names
            node.names = [ast.alias(name=name, asname=None) for name in names]
        return node

This won't work as-is because get_importable_names actually imports the module. If we could have a version of get_importable_names that didn't import the module then this would work.

If we could have a version of get_importable_names that didn't import the module then this would work.

Perhaps the way would be to get the file corresponding to node.module, then parse that AST (recursively since it might have a *-import).

Here's something that mostly works.

Long code example
def _get_importable_names(filename, module_name):

    path = Path(filename).parent.joinpath(*module_name.split("."))
    if path.is_dir():
        path = path.joinpath("__init__.py")
    else:
        path = path.with_suffix(".py")
    module_code = path.read_text()

    parsed_code = ast.parse(module_code)
    names = []

    # Look for an assignment to __all__ in the module's AST
    for node in ast.walk(parsed_code):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    if isinstance(node.value, (ast.List, ast.Tuple)):
                        names = [
                            elt.s
                            for elt in node.value.elts
                            if isinstance(elt, ast.Constant)
                        ]
                        return names
                    else:
                        raise ValueError("__all__ must be a list or tuple of strings")

    if not names:
        msg = f"Module {path} does not define __all__ or it's not properly formatted"
        msg += module_code
        raise ValueError(msg)

    return names


class _StubVisitor(ast.NodeVisitor):
    """AST visitor to parse a stub file for submodules and submod_attrs."""

    def __init__(self, filename: str):
        self._filename = filename
        self._submodules = set()
        self._submod_attrs = {}

    def visit_ImportFrom(self, node: ast.ImportFrom):
        print("ImportFrom", node.module, [n.name for n in node.names])
        if node.level != 1:
            raise ValueError(
                "Only within-module imports are supported (`from .* import`)"
            )

        aliases = []
        for alias in node.names:
            if alias.name != "*":
                aliases.append(alias.name)
            else:
                aliases.extend(_get_importable_names(self._filename, node.module))

        if node.module:
            attrs: list = self._submod_attrs.setdefault(node.module, [])
            attrs.extend(aliases)
        else:
            self._submodules.update(aliases)

The issue is that _get_importable_names doesn't resolve cases where __all__ in a submodule is validly defined through additions, e.g.

__all__: list[str] = []
__all__ += ["foo"]
__all__ += bar.__all__

AFAIK the only way to do this robustly is to use a static type checker, e.g. mypy.api to correctly analyze the stub file.

Thanks for exploring these solutions, @nstarman!

@tlambert03 may have thoughts, otherwise I'll look into this a bit more.

I think @nstarman has explained it great. I can confirm that start imports are not implemented, and also that doing this robustly in a static way (without also making too many assumptions about the syntax used by the end-user) would indeed be pretty hard without going down the rabbit hole of rewriting a static type checker

I can imagine it being important enough to some users that you'd want to implement some fraction of it (such as @nstarman showed above), but it's probably important to call it relatively experimental, and ensure that users test stuff thoroughly if they are committed to using star imports with attach_stub