`_is_builtin_module` is wrong for environments managed by Spack
When I invoke Parsl (even in the simplest possible case), Parsl uses Dill to serialize the function and its arguments, and that serialization fails. According to the Dill trace, serializing the function and arguments somehow leads to serializing `collections.abc`, which in turn leads to serializing `bytes_iterator`. That last step fails with the following stderr:
...snip...
File "/home/sam/Downloads/test/.spack-env/view/lib/python3.10/site-packages/dill/_dill.py", line 388, in save
StockPickler.save(self, obj, save_persistent_id)
File "/home/sam/Downloads/test/.spack-env/view/lib/python3.10/pickle.py", line 560, in save
f(self, obj) # Call unbound method with explicit self
File "/home/sam/Downloads/test/.spack-env/view/lib/python3.10/site-packages/dill/_dill.py", line 1711, in save_type
StockPickler.save_global(pickler, obj, name=obj_name)
File "/home/sam/Downloads/test/.spack-env/view/lib/python3.10/pickle.py", line 1071, in save_global
raise PicklingError(
_pickle.PicklingError: Can't pickle <class 'bytes_iterator'>: it's not found as builtins.bytes_iterator
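The last frame of the traceback can be reproduced with the standard `pickle` module alone, which may help separate the symptom from the Spack-specific cause. The following is a minimal sketch, not taken from Dill or Parsl; the helper name `pickle_bytes_iterator` is hypothetical. It obtains the type through `type(iter(b""))` and tries to pickle it by reference.

```python
import pickle


def pickle_bytes_iterator():
    """Try to pickle the bytes_iterator type by reference (hypothetical helper)."""
    # The type of an iterator over a bytes object is the class named in the traceback.
    bytes_iterator = type(iter(b""))
    try:
        # pickle saves classes by reference, i.e. as "<module>.<name>".
        pickle.dumps(bytes_iterator)
    except pickle.PicklingError as exc:
        # The class reports its module as builtins, but the builtins module
        # has no attribute with that name, so the lookup fails.
        return bytes_iterator.__module__, bytes_iterator.__name__, type(exc).__name__
    return None


if __name__ == "__main__":
    print(pickle_bytes_iterator())
```

So the type itself can never be pickled by reference; the error only surfaces here because a module containing it is being serialized by value rather than by reference.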
(`bytes_iterator` is indeed a member of `collections.abc`), but the bigger question is why Dill is trying to serialize builtin modules at all. In fact, `dill._dill._is_builtin_module(collections)` returns `False` instead of `True` when Python and Dill are installed by Spack.
>>> import dill, collections, sys, os
>>> dill._dill._is_builtin_module(collections)
False
>>> # This is incorrect; collections **is** builtin.
>>> collections.__file__
'/home/sam/Downloads/test/.spack-env/view/lib/python3.10/collections/__init__.py'
>>> sys.prefix
'/home/sam/Downloads/test/.spack-env/view'
>>> # So far, so good. collections.__file__.startswith(sys.prefix)
>>> os.path.realpath(collections.__file__)
'/home/sam/.local/share/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/python-3.10.8-wobwcruhfbzy5noyhl4vmvi2tuygyw6k/lib/python3.10/collections/__init__.py'
>>> os.path.realpath(sys.prefix)
'/home/sam/Downloads/test/.spack-env/._view/y3klaw6vrkdxyp23swulxprknwvfpsn6'
While `collections.__file__` is within `sys.prefix`, its realpath is not within the realpath of `sys.prefix`. This is because Spack manages Python environments by symlinking packages into a "view".
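The effect can be shown without Spack or Dill. The sketch below builds a throwaway directory tree in which a "view" contains a symlink to a package directory that actually lives in a separate "store", then checks whether the module file lies under the view prefix with and without resolving symlinks. The layout and the names `under_prefix` and `demo` are assumptions made for illustration; this is not Dill's implementation, and a real Spack view is more elaborate.

```python
import os
import tempfile


def under_prefix(path, prefix, resolve):
    """Return True if path lies under prefix, optionally resolving symlinks first."""
    if resolve:
        path, prefix = os.path.realpath(path), os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def demo():
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        # Hypothetical install location, standing in for the Spack store.
        store_pkg = os.path.join(root, "store", "python", "lib", "collections")
        # Hypothetical view, standing in for sys.prefix.
        prefix = os.path.join(root, "view")
        view_lib = os.path.join(prefix, "lib")
        os.makedirs(store_pkg)
        os.makedirs(view_lib)
        open(os.path.join(store_pkg, "__init__.py"), "w").close()
        # The view only holds a symlink to the package in the store.
        os.symlink(store_pkg, os.path.join(view_lib, "collections"))
        module_file = os.path.join(view_lib, "collections", "__init__.py")
        return (
            under_prefix(module_file, prefix, resolve=False),
            under_prefix(module_file, prefix, resolve=True),
        )


if __name__ == "__main__":
    print(demo())
```

On a system that supports symlinks, the unresolved check succeeds and the resolved check fails, matching the REPL session above: any test that compares realpaths will conclude that the module is outside the prefix.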
Here is a minimum working example:
apt update && apt install -y build-essential ca-certificates coreutils curl environment-modules gfortran git gpg lsb-release python3 python3-distutils python3-venv unzip zip
git clone -c feature.manyFiles=true https://github.com/spack/spack.git
source spack/share/spack/setup-env.sh
spack install python@3.10
spack install py-dill@0.3.5.1
# Note this also fails in Dill 0.3.6, but that is not yet in Spack's default package repo.
python3.10 -c 'import collections, dill; print(dill._dill._is_builtin_module(collections))'
# Prints False
python3.10 -c 'import collections, dill; dill.dumps(collections)'
# Errors with the above traceback.