scikit-build/scikit-build-core

Entry points do not work for namespace packages

vyasr opened this issue · 4 comments

scikit-build-core produces the list of CMake paths by requesting the entry points from importlib and then converting those paths to strings. This works fine for normal packages, where resources.files produces a PosixPath. However, it breaks down for namespace packages because resources.files returns a MultiplexedPath, which does not transparently convert to a path when str is called, since it could in principle represent many paths.

This example illustrates the issue:

from importlib import resources
from tempfile import TemporaryDirectory
import sys
import os

print("Namespace package")
with TemporaryDirectory() as tmpdir:
    sys.path.append(os.path.join(tmpdir, os.pardir))
    print(repr(resources.files(os.path.basename(tmpdir))))
    print(resources.files(os.path.basename(tmpdir)))

print("\nNormal package")
with TemporaryDirectory() as tmpdir:
    sys.path.append(os.path.join(tmpdir, os.pardir))
    with open(os.path.join(tmpdir, '__init__.py'), 'w') as f:
        f.write('')
    print(repr(resources.files(os.path.basename(tmpdir))))
    print(resources.files(os.path.basename(tmpdir)))

Outputs

(main) dt08% python test.py
Namespace package
MultiplexedPath('/tmp/tmp734gapja/../tmp734gapja')
MultiplexedPath('/tmp/tmp734gapja/../tmp734gapja')

Normal package
PosixPath('/tmp/tmpn4bpxjet/../tmpn4bpxjet')
/tmp/tmpn4bpxjet/../tmpn4bpxjet

I don't know the importlib interfaces terribly well, but this CPython issue has some suggestions for how to properly handle the output of resources.files in a more generic manner that might be helpful.

I think we could require importlib_resources >= 5.9.0 on Python < 3.12, and then use as_file to get the directory. I believe this wasn't available when I first wrote this; I did try. ;)

from importlib import resources
from tempfile import TemporaryDirectory
import sys
import os

print("Namespace package")
with TemporaryDirectory() as tmpdir:
    sys.path.append(os.path.join(tmpdir, os.pardir))
    print(repr(resources.files(os.path.basename(tmpdir))))
    print(resources.files(os.path.basename(tmpdir)))
    with resources.as_file(resources.files(os.path.basename(tmpdir))) as f:
        print(f)

print("\nNormal package")
with TemporaryDirectory() as tmpdir:
    sys.path.append(os.path.join(tmpdir, os.pardir))
    with open(os.path.join(tmpdir, '__init__.py'), 'w') as f:
        f.write('')
    print(repr(resources.files(os.path.basename(tmpdir))))
    print(resources.files(os.path.basename(tmpdir)))
    with resources.as_file(resources.files(os.path.basename(tmpdir))) as f:
        print(f)

Alternatively, we could add specific handling for MultiplexedPath. The as_file approach above will work in any case, including running from a .pyz, though it does add some extra work (temporarily copying files if this does represent multiple paths, I think), and we have to handle the context manager.
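One way the context manager could be handled is to register it on a contextlib.ExitStack that lives as long as the build does. This is only a minimal sketch: enter_package_dir is a hypothetical helper name, not something scikit-build-core provides, and directory support in as_file for non-filesystem traversables is assumed to need Python 3.12 or importlib_resources >= 5.9.0, as mentioned above. The usage lines use the stdlib json package purely as a stand-in for a package that ships CMake files.

```python
import contextlib
from importlib import resources
from pathlib import Path


def enter_package_dir(stack: contextlib.ExitStack, package: str) -> Path:
    """Return a real directory for `package`, valid until `stack` is closed.

    Hypothetical helper for illustration; not part of scikit-build-core.
    """
    traversable = resources.files(package)
    # as_file returns a context manager. Registering it on the stack defers
    # cleanup of any temporary copy until the caller closes the stack.
    return Path(stack.enter_context(resources.as_file(traversable)))


# Usage: keep the stack open for as long as CMake needs the directory.
with contextlib.ExitStack() as stack:
    cmake_dir = enter_package_dir(stack, "json")  # stand-in package name
    print(cmake_dir)
```

For a normal on-disk package, as_file hands back the existing path, so the extra cost should only apply to traversables that are not plain filesystem paths.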

What does resources.files do if there are multiple files represented by the path? Do we need any special handling for the case where there are multiple importable directories with the same name (i.e. the intended use case for PEP 420-style namespace packages)?

I believe it makes a copy of the merged directory for the duration of the context manager, the same way it would expose a zip directory.