Problem serializing instance of a class that uses a module
neucer opened this issue · 4 comments
I have a problem serializing an instance of a class that uses a module, something like:
import some_module
import dill
class SomeClass:
    def __init__(self):
        super().__init__()
    def foo(self):
        some_module.bar()
def main():
    obj = SomeClass()
    with open("path.pkl", "wb") as f:
        dill.dump(obj, f)
if __name__ == "__main__":
    main()
some_module contains things that can't be serialized, and that's fine, I don't want to serialize it. But if I serialize with dill.settings["recurse"] = False, I get the error name 'some_module' is not defined when deserializing. If I serialize with dill.settings["recurse"] = True, I get an error about the things in some_module that can't be serialized.
I know as a workaround I can move the import some_module into the foo function, but I'm building a framework and I don't want to have to ask my users to do that. Also, pickle does not seem to have such an issue.
Can you give an example that I can confirm, especially if pickle works as you say? Your example seems to work for me, so I'd like to see where you are seeing a failure.
Python 3.8.17 (default, Jun 11 2023, 01:54:00)
[Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> import dill
>>> class SomeClass:
...     def __init__(self):
...         super().__init__()
...     def foo(self):
...         return math.sin(0)
...
>>> def main():
...     obj = SomeClass()
...     with open('path.pkl', 'wb') as f:
...         dill.dump(obj, f)
...
>>> main()
>>> with open('path.pkl', 'rb') as f:
...     print(dill.load(f).foo())
...
0.0
>>> dill.__version__
'0.3.8.dev0'
>>>
If you put this in some_module.py:
import ctypes
# load an arbitrary dll
lib = ctypes.CDLL('C:/Windows/System32/msvcp100.dll')
def bar():
    pass
this fails:
import some_module
import dill
class SomeClass:
    def __init__(self):
        super().__init__()
    def foo(self):
        some_module.bar()
def main():
    obj = SomeClass()
    dill.settings["recurse"] = True
    with open("path.pkl", "wb") as f:
        dill.dump(obj, f)
if __name__ == "__main__":
    main()
with error Can't pickle <class '_ctypes.PyCFuncPtrType'>: it's not found as _ctypes.PyCFuncPtrType.
This succeeds:
import some_module
import pickle
class SomeClass:
    def __init__(self):
        super().__init__()
    def foo(self):
        some_module.bar()
def main():
    obj = SomeClass()
    with open("path.pkl", "wb") as f:
        pickle.dump(obj, f)
if __name__ == "__main__":
    main()
And I can't remove the dill.settings["recurse"] = True because SomeClass sometimes also uses globals.
If you use this file (I'm testing on macOS, not Windows):
# file: some_module.py
import ctypes
import ctypes.util
lib = ctypes.CDLL(ctypes.util.find_library('libc'))
def bar():
    return 0.0
then do what you have above, I can reproduce the error with dill, and no error with pickle.
However, pickle doesn't actually work...
Python 3.8.17 (default, Jun 11 2023, 01:54:00)
[Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> with open('path.pkl', 'rb') as f:
...     obj = pickle.load(f)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: Can't get attribute 'SomeClass' on <module '__main__' (<_frozen_importlib_external.SourceFileLoader object at 0x104541c70>)>
because the class is pickled by reference, and the reference is not present. Were you to put this into a file, and then install the file as a module that is globally available upon import, then pickle would work. The difference is that dill is serializing the class itself, and to do that, it needs to serialize the method and the underlying function, which uses globals. dill provides a few options for serializing the class (including byref, which reproduces the behavior of pickle). If there's something in globals that's not serializable, then serialization is going to fail. I think the best solution is probably to suggest that users put the import inside the function, so that the function doesn't rely on references to globals.
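The by-reference behavior described above can be seen with the standard library alone. This is only an illustrative sketch: load_without_class is a hypothetical helper that temporarily hides the class from its module to imitate loading the pickle in a fresh interpreter where SomeClass was never defined.

```python
import pickle
import sys


class SomeClass:
    def foo(self):
        return 0.0


# pickle records only the module name and class name for SomeClass;
# the method body and the globals it uses are not written to the payload.
payload = pickle.dumps(SomeClass())
mentions_class = b"SomeClass" in payload
mentions_method = b"foo" in payload


def load_without_class(data):
    """Unpickle while SomeClass is temporarily missing from its module."""
    module = sys.modules[SomeClass.__module__]
    saved = module.__dict__.pop("SomeClass")
    try:
        pickle.loads(data)
        return None
    except AttributeError as exc:
        # Same kind of failure as in the traceback above.
        return str(exc)
    finally:
        # Put the class back so the rest of the program is unaffected.
        setattr(module, "SomeClass", saved)


error = load_without_class(payload)
print(mentions_class, mentions_method)
print(error)
```

Because the payload holds only a name, dumping succeeds even when the method's globals cannot be serialized, and loading succeeds only where that name can be imported again.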
This is a known issue, so I'm going to close this as a duplicate... or you can confirm that pickle performs differently than described above.
Feel free to reopen given my notes above