Wrong __main__ module set in function reconstructed in child process
Closed this issue · 4 comments
Hello dear maintainers
I'm trying to use dill to serialize functions I pass to child processes created using multiprocessing. It works well when the child process is created by forking the current process, but gives weird results when the method is to spawn a whole new Python process.
Here is a minimal example that fails (Python 3.8.10, Ubuntu 20):
import dill
import multiprocessing as mp
#import multiprocess as mp  # multiprocess does not help, even though it is made to work with dill

def test_presence():
    print('present !')

def job():
    import sys
    print('job')
    print('__main__ is', sys.modules[__name__].__name__, sys.modules[__name__].__dict__.keys())
    print('but got', __name__, globals().keys())
    print()
    test_presence()

def extract(dump):
    dill.loads(dump)()

if __name__ == '__main__':
    mp.set_start_method('spawn')
    #mp.set_start_method('fork')  # no problem with that one, but not available on all platforms
    process = mp.Process(target=extract, args=(dill.dumps(job),))
    process.start()
It gives the following results:
job
__main__ is __mp_main__ dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', 'dill', 'mp', 'test_presence', 'job', 'extract'])
but got __main__ dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', 'spawn_main'])
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ydejonghe/robot-cueillette/tests/test_dill_multiprocessing.py", line 17, in extract
    dill.loads(dump)()
  File "tests/test_dill_multiprocessing.py", line 14, in job
    test_presence()
NameError: name 'test_presence' is not defined
You can see that function job is dilled with a reference to module __main__, but when it is reconstructed in the child process, dill creates a custom dictionary to use as module '__main__', because the child's main module has been renamed '__mp_main__'.
What I fail to understand is that the child processes still have a '__main__' entry in sys.modules, so dill should be able to pick the right module for the reconstructed function.
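The NameError above comes down to which globals dictionary the rebuilt function is attached to. Here is a minimal sketch of that effect using only the standard library. It does not use dill or multiprocessing, and `rebuild`, `stale_namespace` and `good_namespace` are hypothetical names introduced for illustration, not dill internals:

```python
import builtins
import types

def test_presence():
    return 'present !'

def job():
    # 'test_presence' is looked up in job's globals dictionary at call time.
    return test_presence()

def rebuild(func, namespace):
    """Recreate func so that its global names resolve in namespace."""
    return types.FunctionType(func.__code__, namespace, func.__name__)

# A dictionary that claims to be '__main__' but lacks the script's functions,
# similar to the one printed after "but got" in the spawn output.
stale_namespace = {'__name__': '__main__', '__builtins__': builtins}

# A dictionary that does contain the helper, as in the fork case.
good_namespace = {'__name__': '__main__', '__builtins__': builtins,
                  'test_presence': test_presence}

try:
    rebuild(job, stale_namespace)()
    stale_result = 'no error'
except NameError as exc:
    stale_result = str(exc)

good_result = rebuild(job, good_namespace)()
```

The same code object fails or succeeds depending only on the dictionary passed as its globals, which is consistent with the two outputs reported for 'spawn' and 'fork'.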
Using the 'fork' child creation method does not rename the main module, so the issue does not occur:
job
__main__ is __main__ dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', '__file__', '__cached__', 'dill', 'multiprocessing', 'test_presence', 'job', 'extract', 'process'])
but got __main__ dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', '__file__', '__cached__', 'dill', 'multiprocessing', 'test_presence', 'job', 'extract', 'process'])
present !
The same problem occurs using multiprocess instead of multiprocessing.
Do you see any workaround for this?
Thanks for the question, and investigating a bit.
This is a duplicate of uqfoundation/multiprocess#65, and it's due to differences in pickling across the different contexts. I don't yet have a good solution for the default dill serialization settings for spawn... however, if you use the recurse=True setting, it should work in your case.
import dill
import multiprocess as mp
dill.settings['recurse'] = True
...
This worked for me.
Python 3.8.15 (default, Oct 12 2022, 04:30:07)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.__version__
'0.3.6.dev0'
>>> import multiprocess
>>> multiprocess.__version__
'0.70.14.dev0'
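For reference, recurse can also be passed per call to dill.dumps instead of being set globally. This is a minimal sketch, assuming dill is installed. It reuses the function names from the report but returns a string instead of printing, and it only round-trips in the current process rather than through a spawned child:

```python
import dill

def test_presence():
    return 'present !'

def job():
    return test_presence()

# With recurse=True, dill pickles the globals that job refers to
# (here test_presence) rather than the whole globals dictionary.
payload = dill.dumps(job, recurse=True)

restored = dill.loads(payload)
outcome = restored()
```

Passing the flag per call keeps the change local to the functions that are sent to child processes, whereas dill.settings['recurse'] = True affects every later dump in the process.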
Please close the issue if this answers your question.
Also see: #105, and other linked issues, for a longer discussion. I'm going to switch and close this as a duplicate.
Thanks for your fast answer ! (and all that material)
I investigated a bit more and found a potential solution. I'm posting it here, because I'm not sure the issue was the only problem in #115 and #105. I guess it could solve uqfoundation/multiprocess#65
I solved the problem by altering the exceptional behavior handling the main module in _dill.Unpickler
if (module, name) == ('__builtin__', '__main__'):
    # formerly: self._main is not __main__ anymore
    #return self._main.__dict__ #XXX: above set w/save_module_dict
    # fix: get a reference to the most recent __main__
    import __main__
    return __main__.__dict__
It seems that the 'spawn' start method reassigns the main module after the child process has been fully initialized, hence the reference to module __main__ that dill stores at initialization (in _dill._main_module, and in self._main afterward) is no longer valid.
To check that, adding the following assertion before the above lines will raise an AssertionError:
assert self._main is __main__ or _main_module is __main__
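The stale-reference effect can be reproduced without multiprocessing or dill. This is a minimal sketch, assuming the only thing that matters is that the '__main__' entry of sys.modules is rebound after a library has cached the module object. `cached_main`, `replacement` and `only_in_new_main` are illustrative names, not dill or multiprocessing internals:

```python
import sys
import types

def stale_main_demo():
    # Stand-in for a library caching the main module when it is first imported.
    cached_main = sys.modules['__main__']

    # Stand-in for start-up code that rebinds the '__main__' entry later on.
    replacement = types.ModuleType('__mp_main__')
    replacement.only_in_new_main = True
    sys.modules['__main__'] = replacement
    try:
        # The import statement consults sys.modules at call time,
        # so it sees the rebound entry rather than the cached object.
        import __main__ as current_main
        return {
            'cache_is_current': cached_main is current_main,
            'cache_has_attr': hasattr(cached_main, 'only_in_new_main'),
            'current_has_attr': hasattr(current_main, 'only_in_new_main'),
        }
    finally:
        # Restore the real entry so the demo has no lasting side effect.
        sys.modules['__main__'] = cached_main

result = stale_main_demo()
```

A late `import __main__` inside the function picks up the rebound module while the cached reference does not, which is the behaviour the proposed change relies on.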
I'm not sure this fix covers the vast number of cases dill is trying to handle. At least it fixes the minimal example in the above description. Do you think this could be an acceptable solution for this bug?
Hmm.... very interesting. I like it. Yeah, that is a pretty good potential fix for the bug. I'd like to test it out against the dill and multiprocess test suites. Something like that I generally also test against klepto and mystic, as they have some very advanced serialization cases.
Thanks! Feel free to submit a PR.