indygreg/PyOxidizer

importlib.resources

abuckenheimer opened this issue · 4 comments

I'm piecing together a few parts of the documentation here, so this may be a misunderstanding on my part, but I'm having trouble getting importlib.resources to act like I expect. Here's a trivial example I built (a diff against the skeleton generated by pyoxidizer init resource):

diff --git a/pyoxidizer.toml b/pyoxidizer.toml
index b401b2d..70297a4 100644
--- a/pyoxidizer.toml
+++ b/pyoxidizer.toml
@@ -78,10 +78,11 @@ type = "write-license-files"
 path = ""
 
 # Package .py files discovered in a local directory.
-# [[packaging_rule]]
-# type = "package-root"
-# path = "./src"
-# packages = ["foo", "bar"]
+[[packaging_rule]]
+type = "package-root"
+path = "./src"
+packages = ["foo"]
+include_source = true
 
 # Package things from a populated virtualenv.
 # [[packaging_rule]]
@@ -99,15 +100,16 @@ path = ""
 # Python interpreter is invoked, this section is not relevant.
 [[embedded_python_run]]
 # Run an interactive Python interpreter.
-mode = "repl"
+# mode = "repl"
 
 # Import a Python module and run it.
 #mode = "module"
 #module = "mypackage.__main__"
 
 # Evaluate some Python code.
-# mode = "eval"
-# code = "from mypackage import main; main()"
+mode = "eval"
+# code = "from foo import show_resources; show_resources('foo')"
+code = "from foo import show_resources; show_resources()"
 
 # Keeps track of which version of PyOxidizer is managing this project.
 # THIS SECTION IS MANAGED BY PYOXIDIZER AND SHOULD NOT BE CHANGED BY PEOPLE.
diff --git a/src/foo/__init__.py b/src/foo/__init__.py
new file mode 100644
index 0000000..6b3e5f5
--- /dev/null
+++ b/src/foo/__init__.py
@@ -0,0 +1,6 @@
+from importlib.resources import contents, read_text
+
+def show_resources(pkg=__package__):
+    print("package:", pkg)
+    print("contents:", contents(pkg))
+    print("text:\n", read_text(pkg, "__init__.py"))

So when I run this with pyoxidizer run, I'd expect the following:

package: foo
contents: <list_iterator object at 0x7fab342760b8>
text:
from importlib.resources import contents, read_text

def show_resources(pkg=__package__):
    print("package:", pkg)
    print("contents:", contents(pkg))
    print("text:\n", read_text(pkg, "__init__.py"))

Instead I get:

package: 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "foo", line 5, in show_resources
  File "importlib.resources", line 248, in contents
  File "importlib.resources", line 47, in _get_package
  File "importlib", line 127, in import_module
  File "<frozen importlib._bootstrap>", line 1003, in _gcd_import
  File "<frozen importlib._bootstrap>", line 942, in _sanity_check
ValueError: Empty module name
error: cargo run failed

So __package__ doesn't get set, which may be a known issue. But even if I specify the package explicitly in embedded_python_run (code = "from foo import show_resources; show_resources('foo')", the line commented out in the diff above), I still can't list the contents of the package:

package: foo
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "foo", line 5, in show_resources
  File "importlib.resources", line 248, in contents
  File "importlib.resources", line 49, in _get_package
TypeError: 'foo' is not a package

This is also weird because it's not consistent with another time I tried this, where I could successfully call contents but it returned an empty tuple, whereas I would expect it to contain __init__.py. Apologies if I'm missing something obvious here. Cool project though!

$ pyoxidizer -V
PyOxidizer 0.1.2
$ rustc --version
rustc 1.35.0 (3c235d560 2019-05-20)

I would not at all be surprised if there were bugs in the behavior of the module resources APIs! There are a few issues to unravel in here. I'll definitely look at this in more detail in the days ahead unless someone beats me to it.

Thank you for reporting your experience!

Thank you for the detailed reproduction case.

It looks like the behavior of importlib.resources.contents() and importlib.resources.read_text() is what you'd expect given a package name of ''. So that's good.

The underlying problem appears to be that __package__ is ''. It should be foo in this case. So if we fix __package__, importlib.resources should just work.
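For anyone following along, the first traceback can be reproduced without PyOxidizer. This is only a minimal sketch, assuming that importlib.resources resolves a package name by handing it to importlib.import_module(), as the traceback above suggests. The describe_lookup helper is a hypothetical name used for illustration and is not part of PyOxidizer or importlib.

```python
import importlib


def describe_lookup(name):
    """Try to import `name` the way a resource lookup would, and report the outcome.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(name)
    except ValueError as exc:
        # An empty name is rejected before any importer is consulted.
        return f"ValueError: {exc}"
    return f"imported {module.__name__}"


# An empty package name, which is what show_resources() received as its default.
empty_result = describe_lookup("")
# A normal module name, for contrast.
json_result = describe_lookup("json")

print(empty_result)
print(json_result)
```

Note that the default in show_resources(pkg=__package__) is evaluated once, when the function is defined, so whatever value the importer gave __package__ at import time is what the call sees later.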

Lemme see about coding up a fix...

OK, things should now behave properly on the main branch. Here is the new output from your test case:

package: foo
contents: []
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "foo", line 6, in show_resources
  File "importlib.resources", line 169, in read_text
  File "importlib.resources", line 125, in open_text
FileNotFoundError: resource not found

I think this is arguably correct behavior. It is slightly different from what you get from the file-based importer because .py files aren't registered as resources by default with the in-memory importer: only non-module files are registered as resources. The file-based importer, by contrast, does a blind os.listdir() and open() to load things.
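To illustrate the file-based side of that comparison, here is a rough sketch that builds a throwaway package on disk and lists what the standard importer exposes for it. It assumes Python 3.9 or newer and uses importlib.resources.files() rather than contents(); demo_pkg, build_demo_package and list_package_files are made-up names for this example, and nothing here exercises PyOxidizer's in-memory importer.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def build_demo_package(base_dir, pkg_name="demo_pkg"):
    """Create a tiny on-disk package with one module file and one data file."""
    pkg_dir = Path(base_dir) / pkg_name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "data.txt").write_text("hello\n")
    return pkg_name


def list_package_files(pkg_name):
    """Return the sorted entry names the importer exposes for a package."""
    root = importlib.resources.files(pkg_name)
    return sorted(entry.name for entry in root.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    name = build_demo_package(tmp)
    sys.path.insert(0, tmp)
    try:
        # Make sure the import system notices the freshly created directory.
        importlib.invalidate_caches()
        on_disk_entries = list_package_files(name)
    finally:
        # Leave the interpreter state as we found it.
        sys.path.remove(tmp)
        sys.modules.pop(name, None)

print(on_disk_entries)
```

With the file-based importer the listing includes __init__.py alongside data.txt (and possibly a __pycache__ directory), because it simply reflects the directory. Under the in-memory importer described above, only the non-module file would be expected to show up.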

If you want to pick up Python module files as resources (it would be a legitimate feature request), we could make that a configurable setting via the packaging rules. Please file a new issue for that if wanted.

thanks for taking a look! will open a new issue