pytest-dev/pytest

[doctests] Namespace packages (PEP 420) missing, leading to errors with relative imports

punitchandra opened this issue · 26 comments

When testing packages that use an implicit namespace with --doctest-modules, the module under doctest is imported without the implicit namespace.

This causes problems because the module can get imported under two different names: once with the proper path and once with the namespace missing.

My pytest related versions:
pytest (3.5.0)
pytest-attrib (0.1.3)
pytest-cov (2.5.1)
pytest-forked (0.2)
pytest-timeout (1.2.1)
pytest-xdist (1.22.2)

OS is Windows 7.
Please look at the example:

foo.zip

GitMate.io thinks possibly related issues are #603 (monkeypatch does not work on already-imported function), #181 (--pdb does not work when collecting), #275 (doctest does not consider usefixtures in pytest.ini), #605 (tmpdir.join("foo").write(...) doesn't work as expected.), and #478 (--pyargs does not understand namespace packages).

Unfortunately pytest currently lacks support for namespace packages.

Related: #1927 and #478

#1927 is related to relative imports with implicit namespace. This is documented as not being supported here. This issue is with doctest and absolute imports in implicit namespace. It would help if the documentation clarified that pytest does not support doctests for implicit namespace packages.
Also, is there any plan to add this support?

Hi @punitchandra, there are no specific plans for this feature. We have very loose plans since we have very little man-power, unfortunately.

If you want to help out though, you are more than welcome. We would love to guide you through a PR if you have the time.

I believe the issue is illustrated by this exercise I just ran:

~ $ cd ~/draft
draft $ ls
draft $ mkdir pkg
draft $ touch pkg/__init__.py
draft $ cat > pkg/mod.py
"""
>>> print(__name__)
pkg.mod
"""
draft $ pip-run -q pytest -- -m pytest --doctest-modules
========================================================== test session starts ==========================================================
platform darwin -- Python 3.9.0, pytest-6.2.1, py-1.10.0, pluggy-0.13.1
rootdir: /Users/jaraco/draft
collected 1 item                                                                                                                        

pkg/mod.py .                                                                                                                      [100%]

=========================================================== 1 passed in 0.03s ===========================================================
draft $ rm pkg/__init__.py
draft $ pip-run -q pytest -- -m pytest --doctest-modules
========================================================== test session starts ==========================================================
platform darwin -- Python 3.9.0, pytest-6.2.1, py-1.10.0, pluggy-0.13.1
rootdir: /Users/jaraco/draft
collected 1 item                                                                                                                        

pkg/mod.py F                                                                                                                      [100%]

=============================================================== FAILURES ================================================================
_____________________________________________________________ [doctest] mod _____________________________________________________________
001 
002 >>> print(__name__)
Expected:
    pkg.mod
Got:
    mod

/Users/jaraco/draft/pkg/mod.py:2: DocTestFailure
======================================================== short test summary info ========================================================
FAILED pkg/mod.py::mod
=========================================================== 1 failed in 0.02s ===========================================================
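
For comparison, Python's own import system reports the dotted name once the parent directory is on sys.path, even without an __init__.py. The following is a minimal sketch run outside pytest; the names pkg_nsdemo and mod and the temporary directory are assumptions made only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout in a temp dir: <root>/pkg_nsdemo/mod.py, with no __init__.py.
root = Path(tempfile.mkdtemp())
(root / "pkg_nsdemo").mkdir()
(root / "pkg_nsdemo" / "mod.py").write_text("NAME = __name__\n")

sys.path.insert(0, str(root))
try:
    importlib.invalidate_caches()
    mod = importlib.import_module("pkg_nsdemo.mod")
    pkg = sys.modules["pkg_nsdemo"]
finally:
    sys.path.remove(str(root))

# The module sees its fully qualified name, namespace included.
module_name = mod.NAME
# A PEP 420 namespace package has no __init__.py, so it has no usable __file__.
namespace_has_file = getattr(pkg, "__file__", None) is not None
print(module_name, namespace_has_file)
```

This is the behavior the doctest above expects: the name includes the namespace whenever the import goes through the directory that contains pkg.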

This issue is exacerbated when one of the modules in the namespace matches the name of a built-in module. In jaraco/jaraco.email@e302051, I converted the project to use PEP 420 namespace packages, but the tests started failing with errors in collection:

__________________ ERROR collecting jaraco/email/messages.py ___________________
/opt/hostedtoolcache/Python/3.8.7/x64/lib/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
    ???
<frozen importlib._bootstrap>:991: in _find_and_load
    ???
<frozen importlib._bootstrap>:973: in _find_and_load_unlocked
    ???
E   ModuleNotFoundError: No module named 'email.messages'

The module is named jaraco.email.messages, but due to the failed detection of the namespace package, it looks for email.messages, in which email resolves to the stdlib module, which has no messages.
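
The failure mode can be sketched with a simplified name-guessing helper. This is an assumed heuristic written for illustration, not pytest's actual code: guess_module_name is a hypothetical function that walks upward while it finds __init__.py files, and the layout (an __init__.py inside jaraco/email but none inside jaraco) is an assumption about the project.

```python
import tempfile
from pathlib import Path


def guess_module_name(path: Path) -> str:
    """Walk upward while __init__.py exists; stop at the first directory without one.

    Simplified illustration of an __init__.py-based heuristic (hypothetical helper).
    """
    parts = [path.stem]
    parent = path.parent
    while (parent / "__init__.py").is_file():
        parts.append(parent.name)
        parent = parent.parent
    return ".".join(reversed(parts))


# Assumed layout: jaraco/ is a namespace (no __init__.py), jaraco/email/ is a regular package.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "jaraco" / "email"
pkg_dir.mkdir(parents=True)
(pkg_dir / "__init__.py").touch()
(pkg_dir / "messages.py").touch()

guessed = guess_module_name(pkg_dir / "messages.py")
print(guessed)  # the jaraco prefix is lost because jaraco/ has no __init__.py
```

Under this heuristic the walk stops at jaraco/, so the guessed name starts at email, which is exactly the name that collides with the stdlib module.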

I'd really like to see a fix for this issue.

This issue is exacerbated when one of the modules in the namespace matches the name of a built-in module.

...any top level module.

+1 on this. I spent hours looking for a solution that doesn't require me to bloat my tests directory with empty __init__.py files, but there actually seems to be no way around it for now.

I'll pledge a $200 bounty for a proper fix for this issue (something that allows re-enabling doctests in the jaraco packages above).

@jaraco I verified that #10088 makes it possible to reenable doctests in your jaraco.site package, once it makes it into a release. It should work for your other projects too. You'll need to:

  • modify your tox.ini to specify pytest --import-mode=importlib {posargs}
  • explicitly specify testpaths=jaraco in your pytest.ini

If you're still offering that bounty, please donate it to https://www.thetrevorproject.org/

@alicederyn

- src
  - package
    -  subpackage1
      - __init__.py
    -  subpackage2
    -  subpackage3
      - sub-package3_1
        - __init__.py
# pytest.ini
[pytest]
addopts = --doctest-modules
testpaths=src
$ python -c 'import pytest; print(pytest.__version__)'
7.1.2
$ pytest --import-mode=importlib --doctest-modules
ERROR src/package/subpackage1/types.py - _pytest.pathlib.ImportPathMismatchError: ('types', '/home/gustavorps/.miniconda3/envs/conda-env/lib/python3.8/...
ERROR src/package/subpackage1/google/cloud/dataproc_v1/__init__.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/google/cloud/dataproc_v1/types.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/http/server.py - ImportError: attempted relative import with no known parent package
ERROR src/package/subpackage1/pipeline/module/http.py - _pytest.pathlib.ImportPathMismatchError: ('http', '/home/gustavorps/.miniconda3/envs/srcr...
ERROR src/package/subpackage1/pipeline/module/lib.py - ImportError: attempted relative import with no known parent package
ERROR src/package/subpackage1/pipeline/module/types.py - _pytest.pathlib.ImportPathMismatchError: ('types', '/home/gustavorps/.miniconda3/envs/sr...
ERROR src/package/subpackage1/pyspark/http_writer.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/types.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/module/__init__.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/module/application.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/module/cli.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/module/udf.py - ModuleNotFoundError: No module named 'pyspark.sql'
ERROR src/package/subpackage1/pyspark/module/benchmark/cli.py - _pytest.pathlib.ImportPathMismatchError: ('cli', '/home/gustavorps/workspace/code...

@gustavorps I don't believe a new release of pytest has been cut yet with this fix in. If it has, I think you probably need to raise a new issue against the importlib setting, as all I did was enable it for doctests.

You are right @alicederyn, the release has not been cut yet, but I will try to use the main branch this week.

A try with the latest main branch, @alicederyn:

$ pip install git+https://github.com/pytest-dev/pytest.git@main
...
$ python -c 'import pytest; print(pytest.__version__)'
7.2.0.dev249+g69f2855c
$ pytest --import-mode=importlib --doctest-modules
============================ test session starts =============================
platform linux -- Python 3.8.13, pytest-7.2.0.dev249+g69f2855c, pluggy-1.0.0
rootdir: /home/gustavorps/workspace/code/foo--ai--ontology-tagging-service/packages/foo-bar, configfile: pytest.ini, testpaths: src
plugins: anyio-3.6.1
collected 3 items / 4 errors                                                 

=================================== ERRORS ===================================
___ ERROR collecting src/foo/bar/google/cloud/dataproc_v1/__init__.py ____
src/foo/bar/google/cloud/dataproc_v1/__init__.py:46: in <module>
    from .types import *
E   ModuleNotFoundError: No module named 'src.foo.bar.google'; 'src.foo.bar' is not a package
______________ ERROR collecting src/foo/bar/http/server.py _______________
src/foo/bar/http/server.py:3: in <module>
    from ..pipeline.span import http as _span_http
E   ModuleNotFoundError: No module named 'src.foo.bar.pipeline'; 'src.foo.bar' is not a package
______ ERROR collecting src/foo/bar/pipeline/span/http.py ______
src/foo/bar/pipeline/span/http.py:8: in <module>
    from . import (
E   ImportError: cannot import name 'lib' from 'src.foo.bar.pipeline.span' (unknown location)
______ ERROR collecting src/foo/bar/pipeline/span/lib.py _______
src/foo/bar/pipeline/span/lib.py:14: in <module>
    from . import types as _types
E   ImportError: cannot import name 'types' from 'src.foo.bar.pipeline.span' (unknown location)
============================== warnings summary ==============================
../../../../../.miniconda3/envs/srcr-bar-v0/lib/python3.8/site-packages/marshmallow/__init__.py:17
  /home/gustavorps/.miniconda3/envs/srcr-bar-v0/lib/python3.8/site-packages/marshmallow/__init__.py:17: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    __version_info__ = tuple(LooseVersion(__version__).version)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================== short test summary info ===========================
ERROR src/foo/bar/google/cloud/dataproc_v1/__init__.py - ModuleNotFoundError: No module named 'src.foo.bar.google'; 'src.foo...
ERROR src/foo/bar/http/server.py - ModuleNotFoundError: No module named 'src.foo.bar.pipeline'; 'src.foo...
ERROR src/foo/bar/pipeline/span/http.py - ImportError: cannot import name 'lib' from 'src.foo.bar.pipeline.onto...
ERROR src/foo/bar/pipeline/span/lib.py - ImportError: cannot import name 'types' from 'src.foo.bar.pipeline.on...
!!!!!!!!!!!!!!!!!! Interrupted: 4 errors during collection !!!!!!!!!!!!!!!!!!!
======================== 1 warning, 4 errors in 2.23s ========================
pytest.ini
setup.py
└── src
    └── foo
        └── bar
            ├── google
            │   └── cloud
            │       └── dataproc_v1
            │           ├── __init__.py
            │           └── types.py
            ├── http
            │   ├── __init__.py
            │   └── server.py
            ├── pipeline
            │   ├── clustering
            │   ├── __init__.py
            │   └── ontologytagger
            │       ├── cli.py
            │       ├── http.py
            │       ├── __init__.py
            │       ├── lib.py
            │       ├── settings.py
            │       └── types.py
            ├── pyspark
            │   ├── http_writer.py
            │   ├── __init__.py
            │   ├── span
            │   │   ├── application.py
            │   │   ├── cli_args.py
            │   │   ├── cli.py
            │   │   ├── __init__.py
            │   │   ├── __main__.py
            │   │   └── udf.py
            │   ├── README.md
            │   └── types.py
            └── types.py
# pytest.ini
[pytest]
testpaths=src
# setup.py
import setuptools
import pathlib

requirements_path = pathlib.Path('./requirements')
requires = {}

for file in requirements_path.glob('*.txt'):
    with file.open('r') as f:
        requires[file.stem] = tuple(
            line.strip()
            for line in f.readlines()
            if not line.startswith('#')
        )
    requires['all'] = set(
        requirement
        for requirement_list in requires.values()
        for requirement in requirement_list
    )

install_requires = requires.pop('core')

setuptools.setup(
    name="foo-bar",
    version="0.0.0",
    description="foo bar package",
    packages=setuptools.find_namespace_packages(where='src'),
    install_requires=install_requires,
    extras_require=requires,
    entry_points = {
        'foo.pyspark.cli.application': [
            # INFO: the entry point name must be the same of
            # CommandLineInterface.typer.name
            'ontology-tagger-v1 = foo.bar.pyspark.spam.cli:CommandLineInterface',
        ],
        'foo.pyspark.types': [
            'foo.pyspark.types.bar = foo.bar.pyspark.types',
        ]
    }
)

You probably need an __init__.py in every directory from src/foo/bar on down. Could you upload this into a test repo somewhere?

If I put an __init__.py in every directory from src/foo/bar on down, I will lose the intent of the PEP 420 (Implicit Namespace Packages) mechanism for my packages.

I intend to create a repository to help reproduce the issue. Thank you in advance for your attention and patience, @alicederyn.

If you want foo/bar to be an implicit namespace then you can't put types.py in it. You'll have to create a foo/bar/types package directory instead and put an __init__.py there.

Thanks alicederyn. I'm just getting back to this issue and excited to see some progress on it.

  • explicitly specify testpaths=jaraco in your pytest.ini

This requirement is inadequate for my goals. I'd like for pytest to honor namespace packages the same way Python does. That is, a developer shouldn't have to enact two steps to add a namespace package. If a developer doesn't have to add the paths for regular packages, they shouldn't have to add them for namespace packages. This requirement gives namespace packages a second-class experience... and it's not possible to write a generic template that honors namespace packages for all projects (one needs a separate test suite template for each namespace package they support).

Consider the jaraco/skeleton project, which provides a template for many projects I maintain. That template can't contain "jaraco" in it, because it supports CherryPy and irc and keyring (without namespace packages) but also backports.functools_lru_cache and configparser (with the backports namespace).

A better approach would be to require a namespace package to add a marker. Similar to __init__.py, a __nspkg__ file or some other convention, would be dramatically better. At least then, the test suite wouldn't necessarily need a customized configuration for each project and it re-uses the presence of the namespace package directory for existence.

But an even better approach would be for pytest to honor the package metadata to determine the top-level packages or simply to honor the Python heuristics for identifying top-level packages (a folder on sys.path is a package no matter what), even if that means being more rigorous about what sys.path should be or being loose about what other folders are inferred as packages.

If you want foo/bar to be an implicit namespace then you can't put types.py in it.

This expectation is a common misconception about namespace packages. Namespace packages are just like normal packages in that they can (and do) contain modules. Consider the jaraco.text package, in which the jaraco.text module is contained in the jaraco namespace. There's no constraint that a namespace package can contain only other packages. The only constraint on a PEP 420 namespace package is that it cannot contain an __init__ module.
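
A minimal sketch of that point, assuming two throwaway directories that stand in for two separately installed distributions. The namespace name ns_demo420 and the module names are hypothetical, chosen only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two separate portions contribute to the same hypothetical namespace ns_demo420.
portion_a = Path(tempfile.mkdtemp())
portion_b = Path(tempfile.mkdtemp())

# Portion A puts a plain module directly inside the namespace.
(portion_a / "ns_demo420").mkdir()
(portion_a / "ns_demo420" / "text.py").write_text("KIND = 'module'\n")

# Portion B puts a regular subpackage inside the same namespace.
(portion_b / "ns_demo420" / "sub").mkdir(parents=True)
(portion_b / "ns_demo420" / "sub" / "__init__.py").write_text("KIND = 'package'\n")

added = [str(portion_a), str(portion_b)]
sys.path[:0] = added
try:
    importlib.invalidate_caches()
    text = importlib.import_module("ns_demo420.text")
    sub = importlib.import_module("ns_demo420.sub")
    # The namespace's __path__ lists every directory that contributes to it.
    portions = len(list(sys.modules["ns_demo420"].__path__))
finally:
    for entry in added:
        sys.path.remove(entry)

print(text.KIND, sub.KIND, portions)
```

Both the module and the subpackage import cleanly, and neither ns_demo420 directory contains an __init__.py.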

In the referenced commit, I've attempted to implement testpaths for configparser, but it doesn't work as expected. Maybe I'm not using it correctly, but it produces an error:

No module named 'src.backports'; 'src' is not a package

It probably doesn't help that both backports.configparser and the top-level tests contain a compat module. Still, it would be nice if the tests could somehow recognize that src should be on sys.path and always import the packages/modules from there.

This expectation is a common misconception about namespace packages. Namespace packages are just like normal packages in that they can (and do) contain modules.

My understanding is that a namespace package will only be installed once, so if you put a file in it, it will only end up being installed if you happen to be first to be pip installed. [edit: my understanding was wrong]

Regardless, pytest's importlib mode has logic that ignores files in namespace modules, so the original comment still stands. [Edit: this should definitely be fact checked, it's probably wrong. I read a bunch of different code in a short space of time and I probably mixed up a bunch of repos]

If a developer doesn't have to add the paths for regular packages, they shouldn't have to add them for namespace packages.

It looks as if importlib mode will become the default at some point, so developers will have to add the paths for both.

But an even better approach would be for pytest to honor the package metadata to determine the top-level packages or simply to honor the Python heuristics for identifying top-level packages (a folder on sys.path is a package no matter what)

I suggest you look for existing issues about this for importlib, or open new ones if they do not exist, so your ideas can be discussed.

I haven't forgotten about this issue. I do still intend to try the proposed fix on a jaraco* package, as that's where I'd reported it recently.

My understanding is that a namespace package will only be installed once, so if you put a file in it, it will only end up being installed if you happen to be first to be pip installed.

I am completely wrong about this! Mea culpa. Files in namespace packages appear to be fine provided they have a unique name.

I've confirmed that use of import-mode=importlib (and without testpaths) fixes the issue for jaraco.email and jaraco.functools, although those projects have a minimal test suite and don't fall afoul of the known drawback of importlib mode:

One drawback however is that test modules are non-importable by each other. Also, utility modules in the tests directories are not automatically importable because the tests directory is no longer added to sys.path.
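
The drawback quoted above can be reproduced outside pytest. This is a rough sketch, not pytest's implementation: load_by_path is a hypothetical helper that imports a file from its path without touching sys.path, and the file names are made up for the example.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical tests directory: a helper module and a test module that imports it.
tests_dir = Path(tempfile.mkdtemp())
(tests_dir / "helper_sketch_only.py").write_text("VALUE = 42\n")
(tests_dir / "test_uses_helper.py").write_text(
    "import helper_sketch_only\nVALUE = helper_sketch_only.VALUE\n"
)


def load_by_path(path: Path):
    """Import a file directly from its path, leaving sys.path untouched."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Without the directory on sys.path, the sibling import cannot be resolved.
try:
    load_by_path(tests_dir / "test_uses_helper.py")
    sibling_import_failed = False
except ModuleNotFoundError:
    sibling_import_failed = True

# With the directory on sys.path, the same file loads fine.
sys.path.insert(0, str(tests_dir))
try:
    importlib.invalidate_caches()
    value_with_sys_path = load_by_path(tests_dir / "test_uses_helper.py").VALUE
finally:
    sys.path.remove(str(tests_dir))

print(sibling_import_failed, value_with_sys_path)
```

Loading by path keeps sys.path clean, and that is why helper modules sitting next to the tests stop being importable by bare name.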

That issue is apparent when applying the same change to jaraco.abode, where modular behaviors are no longer supported. And of course, if I apply the import-mode=importlib technique to projects generally, it'll create problems in projects without namespace packages.

So until import-mode=importlib is a stable approach without major drawbacks, I'm left to adopt the change selectively (as more of a workaround than a robust solution).

In any case, this solution is a decent workaround and a definite improvement for those projects that couldn't run doctests at all, so I'll donate half the bounty to the indicated charity (and since it's a charity, my employer will match).

I plan to follow #8110, #7652, #10332, #10337, #10341, #10228, and others relating to import-mode=importlib.

Thank you, @jaraco! 🙏🏳️‍🌈